TL;DR
A recent Google whitepaper argues that the core of AI development is not model size but the harness and context engineering around the model. For organizations integrating AI, that moves the priority from model upgrades to configuration and verification.
A new whitepaper by Google, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the model accounts for only about 10% of an AI system’s behavior. Instead, the harness and context engineering surrounding the model determine 90% of performance and reliability. This insight shifts the focus of AI development away from chasing larger models toward refining system configuration, verification, and control mechanisms.
The whitepaper, titled The New SDLC With Vibe Coding, defines the harness as everything wrapped around the model: prompts, rules, tools, and observability. By its estimate the harness makes up roughly 90% of the system, while the model accounts for about 10% yet receives disproportionate attention.
The whitepaper cites concrete evidence for this claim: in experiments on public benchmarks such as Terminal-Bench 2.0, changing only the harness or prompts significantly improved performance with the same underlying model. For example, one team moved a coding agent into the top 5 by adjusting only the harness, not the model. The result suggests that configuration and setup can matter as much to outcomes as the choice of model.
The whitepaper also emphasizes that cost management in AI is more about optimizing the harness and context than about acquiring larger models. It argues that ad-hoc prompting and vibe coding are less efficient over the long term because they lead to higher token usage, a heavier maintenance burden, and greater security risk. Disciplined approaches such as agentic engineering, which combines structured context, verification, and tooling, offer better economics and reliability.
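To make the idea of a harness concrete, here is a minimal sketch in Python. It is an illustration written for this article, not code from the whitepaper: the Harness class, the stub_model function, and the "digits only" rule are all hypothetical. The same stub model is wrapped in two harnesses that differ only in their rules, and a verification step checks every output before it is accepted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical stand-in for any model call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Toy harness: structured context (rules) plus a verify-and-retry loop."""
    model: Model
    rules: List[str]                 # context prepended to every task
    verify: Callable[[str], bool]    # check applied to every model output
    max_attempts: int = 3

    def run(self, task: str) -> Optional[str]:
        feedback = ""
        for _ in range(self.max_attempts):
            # Build the prompt from rules, the task, and any failure feedback.
            prompt = "\n".join(self.rules + [task, feedback]).strip()
            output = self.model(prompt)
            if self.verify(output):
                return output
            feedback = f"Previous output failed verification: {output!r}"
        return None  # nothing passed verification within the attempt budget


def stub_model(prompt: str) -> str:
    # Toy behaviour (an assumption for this example): the "model" only
    # follows the formatting rule when the prompt states it explicitly.
    if "digits only" in prompt:
        return "42"
    return "The answer is 42."


# Same model, two harnesses: only the rules differ.
bare = Harness(model=stub_model, rules=[], verify=str.isdigit)
ruled = Harness(model=stub_model, rules=["Reply with digits only."], verify=str.isdigit)

print(bare.run("What is 6 * 7?"))   # None: every attempt fails verification
print(ruled.run("What is 6 * 7?"))  # 42
```

The stub is deliberately simplistic, so the numbers prove nothing about real models. The point is structural: rules, verification, and the retry budget all live outside the model and can be changed without touching it.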
The model is only 10%
A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.
The whitepaper is the clearest map yet of how serious AI development works, and it is mostly tool-agnostic. It is also a Google funnel: the concepts are neutral, but the on-ramps point to Gemini, Jules, and the ADK. If the harness is 90% and it is yours, your moat and your costs both live there. So own your scaffolding, route across models, and remember that AI amplifies whatever engineering culture it lands in.
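One way to read "route across models" is to keep every model behind the same small interface, so the harness decides which one handles a request. The sketch below is an assumption-laden illustration, not something described in the whitepaper: make_router, the two stub models, and the length-based routing policy are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical common interface: every model is a function from prompt to text.
Model = Callable[[str], str]


def make_router(models: Dict[str, Model], pick: Callable[[str], str]) -> Model:
    """Return a Model that forwards each prompt to the model chosen by `pick`."""
    def route(prompt: str) -> str:
        name = pick(prompt)
        if name not in models:
            raise KeyError(f"no model registered under {name!r}")
        return models[name](prompt)
    return route


# Stub models standing in for real providers (assumptions for illustration).
def small_model(prompt: str) -> str:
    return f"[small] {prompt}"


def large_model(prompt: str) -> str:
    return f"[large] {prompt}"


def pick_by_length(prompt: str) -> str:
    # Toy policy: send long prompts to the larger model, the rest to the small one.
    return "large" if len(prompt) > 80 else "small"


router = make_router({"small": small_model, "large": large_model}, pick_by_length)
print(router("Summarize this ticket."))  # [small] Summarize this ticket.
```

Because the router itself satisfies the same interface as a single model, the rest of the harness does not need to know how many models sit behind it. A real policy would more likely route on task type, cost, or latency than on prompt length.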
Implications for AI Development Strategies
This shift in understanding has major implications for organizations integrating AI. It suggests that investing in system architecture, configuration, and verification processes yields greater returns than simply upgrading to larger models. Leaders should focus on building robust harnesses and quality control mechanisms to achieve better performance, security, and cost-efficiency in AI applications.
By recognizing that the model is only 10% of the equation, companies can reallocate resources toward developing better tooling, testing, and context management, ultimately gaining a durable competitive advantage in AI deployment.

Evolution of AI Development Practices
Historically, AI progress has been driven by larger models and more training data. However, recent industry experiments and benchmarks have shown diminishing returns from model size alone. The whitepaper builds on this trend, emphasizing that the surrounding system — prompts, rules, tools, and observability — is where effective control and reliability are achieved.
This perspective aligns with ongoing shifts in AI engineering, where disciplined system design and verification are increasingly prioritized. It also reflects a broader industry move toward cost-effective AI, as token spend and operational costs become critical considerations.
“The model is only 10% of what determines behavior; the harness and context are 90%.”
— Addy Osmani

Uncertainties in Model Versus Harness Impact
While the whitepaper presents strong experimental evidence, it remains to be seen how universally applicable these findings are across different AI applications and industries. The precise quantification of the 10% versus 90% split may vary depending on use case and system complexity. Additionally, the long-term impact of focusing primarily on harness and context engineering is still being evaluated in real-world deployments.

Next Steps for Organizations Adopting AI
Organizations should reassess their AI development priorities, emphasizing the design of robust harnesses, context management, and verification processes. Future research and industry practice are likely to focus on developing standardized frameworks for system configuration and tooling. Companies that adapt quickly by investing in these areas may achieve better performance, security, and cost savings in AI deployment.

Key Questions
Why is the model only 10% of the AI system’s behavior?
The whitepaper argues that most of an AI system’s performance depends on how the model is integrated, controlled, and verified through prompts, rules, and tooling. It attributes about 90% of the system’s behavior to that surrounding harness.
How does this shift affect AI development costs?
Focusing on harness and context engineering can reduce long-term costs by improving efficiency, security, and reliability, even if initial setup costs are higher due to system design and testing.
What should companies do differently based on this insight?
They should prioritize building robust system configurations, verification processes, and tooling around AI models, rather than solely investing in larger or more advanced models.
Is this perspective applicable to all AI applications?
While the findings are supported by experiments, applicability may vary across different domains. Organizations should evaluate the importance of harness and context engineering within their specific use cases.
What is agentic engineering?
Agentic engineering involves designing AI systems with structured context, verification, and tooling that enable reliable and cost-effective operation, moving beyond simple prompt-based interactions.
Source: ThorstenMeyerAI.com