TL;DR
Portugal launched AMÁLIA, a €5.5M European Portuguese LLM. It outperforms earlier open models on key Portuguese benchmarks, but it leaves three critical questions open: how open it really is, how much native data it uses, and what it is optimized for. The same questions apply to national AI strategies across Europe.
Portugal’s €5.5 million investment in the AMÁLIA large language model has resulted in a functioning European Portuguese LLM that outperforms previous open models on key benchmarks, but critical questions about its openness, native data sufficiency, and strategic goals remain unresolved.
Developed by a consortium of approximately 60 researchers from Portugal’s leading institutions, AMÁLIA was built by continuing the training of the EuroLLM model. The base version was completed in September 2025 and publicly released in October. It currently serves 450,000 academic users via the FCT’s IAedu platform, and its knowledge cutoff is the end of 2023.
Technical analysis indicates that AMÁLIA was not trained from scratch but extends an existing multilingual foundation. Only about 5.8 billion tokens of its extended pre-training data come from Portuguese sources, roughly 5.5% of the total. Although it outperforms many models on Portuguese benchmarks, it still trails Qwen 3-8B on the primary ALBA benchmark, a point the official report does not emphasize but that critics such as Duarte O.Carmo have noted.
The project exemplifies the broader European sovereign-LLM movement, which involves multiple countries pursuing native-language models with varying strategies. However, the discourse often focuses on individual model launches rather than the structural questions these efforts raise at a policy level.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and it beats Qwen 3-8B on most pt-PT benchmarks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument. Each question applies directly to AMÁLIA, and the wider European sovereign-LLM movement has not answered any of them explicitly either.
The three questions are structurally linked. Q3 (optimization target) determines Q2 (data volume needed), which in turn conditions Q1 (whether the release is open enough for community contribution). The European sovereign-LLM movement as a whole benefits if answering these questions becomes standard methodology disclosure rather than exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
The structurally tractable question with a structurally surprising answer. For a model whose entire stated purpose is European Portuguese prioritization, the native-language share of extended pre-training is 5.5%. The implications cascade into every other question.
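The share can be checked with a few lines of arithmetic. The sketch below assumes that the 107 billion figure is the extended pre-training total and that the 5.8 billion figure is the clearly pt-PT portion of it, as the heading above suggests; the function name `native_share` is illustrative only. With these rounded inputs the result is about 5.4%, in line with the roughly 5.5% quoted here; the small gap presumably reflects rounding in the published token counts.

```python
# Figures as quoted in this article; both are rounded, so treat the result as approximate.
TOTAL_TOKENS_B = 107.0   # assumed: extended pre-training tokens, in billions
PT_PT_TOKENS_B = 5.8     # assumed: tokens identified as clearly pt-PT, in billions


def native_share(native_b: float, total_b: float) -> float:
    """Return the native-language share of a token budget as a percentage."""
    if total_b <= 0:
        raise ValueError("total token count must be positive")
    return 100.0 * native_b / total_b


share = native_share(PT_PT_TOKENS_B, TOTAL_TOKENS_B)
print(f"pt-PT share of extended pre-training: {share:.1f}%")
```

Publishing the inputs alongside the percentage, as this calculation does, is the kind of data accounting that lets outside readers verify the headline number themselves.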

The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires in operational terms. Olmo doesn’t lead frontier benchmarks. That’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.
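One way to make an openness claim checkable is to list the artifact classes a release publishes and report which are missing. The sketch below is an illustration, not an official definition: the field list is an assumption modelled on the kinds of artifacts the Olmo project is known for releasing (weights, training data, training code, intermediate checkpoints, evaluation code), and the example values are hypothetical placeholders that do not describe AMÁLIA’s actual state.

```python
from dataclasses import dataclass, fields


@dataclass
class OpennessChecklist:
    """Artifact classes a release publishes (assumed list, not an official standard)."""
    weights: bool
    training_data: bool
    training_code: bool
    intermediate_checkpoints: bool
    evaluation_code: bool

    def missing(self) -> list[str]:
        """Names of artifact classes that have not been published."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def fully_open(self) -> bool:
        """True only when every artifact class is published."""
        return not self.missing()


# Hypothetical release, used only to show the mechanics.
hypothetical_release = OpennessChecklist(
    weights=True,
    training_data=False,
    training_code=True,
    intermediate_checkpoints=False,
    evaluation_code=True,
)
print(hypothetical_release.fully_open())
print(hypothetical_release.missing())
```

The value of such a checklist is that “fully open” becomes a per-artifact disclosure rather than a single adjective, so a project can state exactly which items are still pending.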

Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed European sovereign-LLM funding is spread across the major initiatives. The structural question every project faces: what is the actual competitive position you’re staking? There are four options, none mutually exclusive, but each requires different commitments.

Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Implications of the Three Critical Questions for European AI
The development of AMÁLIA highlights fundamental issues facing European national AI initiatives: how open are these models truly? How much native-language data is enough to achieve meaningful performance? And what should be the primary goals—openness, performance, or strategic sovereignty? Addressing these questions is essential for shaping effective, accountable AI policies across Europe, especially as multiple nations pursue similar projects with public funding.
Unanswered or partially addressed, these questions influence not only the technical development of models like AMÁLIA but also the broader strategic landscape, including issues of transparency, data sovereignty, and international competitiveness. The answers will determine how European models evolve and how their deployment impacts societal trust and policy frameworks.
European Sovereign-LLM Efforts and the Structural Dilemma
Across Europe, countries like Italy, Germany, France, and Norway are investing in native-language large language models, often with public funds. These initiatives, including Italy’s Minerva and France’s Mistral, are at similar stages of development, facing shared questions about openness, native data, and strategic objectives. The European OpenEuroLLM consortium exemplifies collective efforts, but the discourse tends to focus on individual model launches rather than the underlying structural challenges.
Portugal’s AMÁLIA serves as a case study because of its public funding and national scope, making the critical questions about model openness, native data sufficiency, and purpose central to national policy debates. These issues are not yet fully addressed publicly, leaving a gap in strategic clarity for policymakers and researchers alike.
“The three questions—how open is ‘fully open,’ how much native data is enough, and what should we optimize for—are at the heart of understanding the true potential and limitations of European sovereign-LLMs.”
— Duarte O.Carmo
Unresolved Questions About Model Openness and Goals
It is not yet clear how open AMÁLIA truly is, given the technical and strategic choices made during development. The extent to which native Portuguese data suffices for future improvements remains uncertain, as does the prioritization of openness versus performance or strategic sovereignty. The final version due in June 2026 may address some of these gaps, but current details are incomplete.
Next Milestones for AMÁLIA and European Sovereign Models
The final version of AMÁLIA, expected in June 2026, will likely clarify some of the current uncertainties, including performance benchmarks and openness parameters. Over the next 12-24 months, further evaluations, policy debates, and possibly more transparent disclosures about native data use and model accessibility are anticipated. European nations will continue to refine their strategies amid ongoing discussions about sovereignty, openness, and strategic goals in AI development.
Key Questions
What are the main concerns about AMÁLIA’s openness?
Critics question whether AMÁLIA is truly open, given its reliance on a continuation of a multilingual foundation and limited native Portuguese data during training. The specifics of its accessibility and transparency are still being evaluated.
How does AMÁLIA compare to other European models?
AMÁLIA outperforms many open models on Portuguese benchmarks and beats Qwen 3-8B on most tests, but it still trails Qwen on the primary ALBA benchmark. Its development approach differs from models trained from scratch, emphasizing continuation rather than foundational training.
Why do these questions matter for European AI policy?
Addressing openness, native data sufficiency, and strategic goals is crucial for building trustworthy, effective, and sovereign AI systems. These factors influence transparency, data sovereignty, and Europe’s competitive position in AI technology.
What are the risks of not answering these questions openly?
Without transparent answers, European AI efforts risk losing public trust, facing strategic vulnerabilities, and missing opportunities for responsible innovation aligned with societal values.
What will happen after the final AMÁLIA release?
Post-release, focus will likely shift to evaluating the model’s real-world performance, transparency measures, and strategic alignment. Policymakers and researchers will debate the next steps for native-language models across Europe.
Source: ThorstenMeyerAI.com