TL;DR

A recent whitepaper from Google emphasizes that in AI-assisted software development, the model itself is only a small part of the system. The real value lies in the harness and context engineering, which dominate system performance and cost.

Authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, the whitepaper states that the AI model accounts for only about 10% of the behavior of AI-assisted development systems. The harness and context engineering account for the remaining 90%, which shifts the focus from model improvements to configuration and design and has major implications for development costs and strategy.

The whitepaper, titled The New SDLC With Vibe Coding, argues that the dominant factor in the effectiveness and reliability of AI coding agents is not the underlying model but the harness: the prompts, tools, rules, and observability layers surrounding the model. Its headline benchmark example is an agent that moved from outside the Top 30 into the Top 5 on Terminal Bench 2.0 when only the harness was changed and the model stayed the same.
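To make the "same model, different harness" idea concrete, here is a minimal Python sketch. It is not code from the whitepaper: the Harness class, the toy_model function, and the run_tests tool are all hypothetical names invented for illustration, and the "model" is a fixed stub rather than a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for any LLM call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules, tools, and a trace for observability."""
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)

    def run(self, model: Model, task: str) -> str:
        # Assemble the prompt from the rules, the available tools, and the task.
        tool_names = ", ".join(sorted(self.tools)) or "none"
        prompt = "\n".join(self.rules + [f"Tools: {tool_names}", f"Task: {task}"])
        self.trace.append(f"prompt:{len(prompt)} chars")
        reply = model(prompt)
        # If the model asks for a tool, run it only if the harness exposes it.
        if reply.startswith("CALL "):
            name, _, arg = reply[5:].partition(" ")
            if name in self.tools:
                self.trace.append(f"tool:{name}")
                return self.tools[name](arg)
            self.trace.append(f"missing_tool:{name}")
            return "FAILED: tool not available"
        return reply


def toy_model(prompt: str) -> str:
    """A fixed fake model (assumption): it always tries to run the tests."""
    return "CALL run_tests ."


# Same model, two harnesses. Only the configuration differs.
bare = Harness()
configured = Harness(
    rules=["Run the tests before declaring a task done."],
    tools={"run_tests": lambda path: "2 passed"},
)

print(bare.run(toy_model, "fix the failing build"))
print(configured.run(toy_model, "fix the failing build"))
print(bare.trace)
```

The bare harness fails because a tool is missing, not because the model is weak, and the trace records that fact. That is the kind of configuration failure the paper has in mind.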

The paper also argues that context engineering, meaning the way information, instructions, and tools are loaded into the agent, matters more than prompt engineering alone. The authors recommend structuring context dynamically and loading only the skills a task needs, which keeps AI development scalable and cost-effective.
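One way to picture "loading only necessary skills" is a selector that ranks skills by relevance to the task and stops at a token budget. The sketch below is an assumption-heavy illustration, not the paper's method: the Skill type, the keyword-overlap scoring, and the 4-characters-per-token estimate are all invented here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Skill:
    name: str
    keywords: FrozenSet[str]
    instructions: str


def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def build_context(task: str, skills: List[Skill], budget: int) -> List[str]:
    """Return the names of the skills to load for this task, within the budget."""
    words = set(task.lower().split())
    # Score each skill by keyword overlap with the task; drop irrelevant ones.
    scored = [(len(words & s.keywords), s) for s in skills]
    relevant = sorted((p for p in scored if p[0] > 0), key=lambda p: -p[0])
    loaded: List[str] = []
    used = 0
    for _, skill in relevant:
        cost = estimate_tokens(skill.instructions)
        if used + cost <= budget:  # skip skills that would exceed the budget
            loaded.append(skill.name)
            used += cost
    return loaded


SKILLS = [
    Skill("sql-migrations", frozenset({"sql", "migration", "schema"}),
          "Write reversible migrations and never edit an applied one."),
    Skill("react-ui", frozenset({"react", "component", "css"}),
          "Prefer function components and keep state local."),
    Skill("testing", frozenset({"test", "tests", "pytest"}),
          "Cover each new behavior with one focused test."),
]

task = "add a sql migration and tests for the new schema"
print(build_context(task, SKILLS, budget=1000))
# prints ['sql-migrations', 'testing']
```

The unrelated react-ui skill never enters the context, so it costs no tokens and adds no noise. A production system would likely score relevance with something better than keyword overlap.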

The paper also challenges the common focus on acquiring the latest models. It suggests that organizations invest more in robust harnesses and context management, because those components are where durable competitive advantage can be built.

At a glance
Report
When: published March 2026
The development: Google’s new whitepaper highlights that in AI coding, the model is only 10% of the system; the majority of influence comes from harness and context engineering.
The Model Is Only 10% — The New SDLC With Vibe Coding
A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
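The split between tests and evals can be sketched as a single gate that requires both. This is a minimal illustration under stated assumptions: the function names, the 0.8 default threshold, and the toy grader are hypothetical, and a real eval would usually rely on a rubric or a judge model rather than a string check.

```python
from typing import Callable, List, Tuple


def run_tests(tests: List[Callable[[], bool]]) -> bool:
    """Deterministic checks: every one must pass."""
    return all(t() for t in tests)


def run_evals(outputs: List[str], grader: Callable[[str], float],
              threshold: float) -> Tuple[bool, float]:
    """Graded checks: the mean score over sampled outputs must reach a threshold."""
    if not outputs:
        raise ValueError("evals need at least one sampled output")
    scores = [grader(o) for o in outputs]
    mean = sum(scores) / len(scores)
    return mean >= threshold, mean


def ci_gate(tests: List[Callable[[], bool]], outputs: List[str],
            grader: Callable[[str], float], threshold: float = 0.8) -> bool:
    """The change ships only if both the tests and the evals pass."""
    tests_ok = run_tests(tests)
    evals_ok, _ = run_evals(outputs, grader, threshold)
    return tests_ok and evals_ok


# Deterministic part: a pure function has exactly one right answer.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


tests = [lambda: slugify("Hello World") == "hello-world"]


# Non-deterministic part: grade sampled commit summaries from an agent.
def grader(summary: str) -> float:
    ok = summary.lower().startswith("fix") and len(summary) <= 72
    return 1.0 if ok else 0.0


samples = ["Fix null check in parser", "Fix off-by-one in pager", "Updated stuff"]

print(ci_gate(tests, samples, grader))                 # prints False
print(ci_gate(tests, samples, grader, threshold=0.6))  # prints True
```

Two of the three samples pass the grader, so the mean score is about 0.67. The gate fails at the default threshold even though every deterministic test is green.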
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: prompts · tools · context · hooks · sandboxes · observability
~90% is your surface area, not the provider’s
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: Low CapEx · High OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: High CapEx · Low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing, with cheap models for the easy work.
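Model routing, the second lever above, can be sketched as a small function that sends easy work to a cheap tier and hard work to a strong one. Everything specific here is an assumption for illustration: the tier names, the prices per 1,000 tokens, the difficulty heuristic, and the flat token count per task are made up and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, not a real quote


CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.010)

# Assumed heuristic: these words or a wide change set signal a hard task.
HARD_MARKERS = ("refactor", "migrate", "concurrency", "security")


def route(task: str, files_touched: int) -> ModelTier:
    """Pick the strong tier only when the task looks hard."""
    text = task.lower()
    is_hard = files_touched > 3 or any(m in text for m in HARD_MARKERS)
    return STRONG if is_hard else CHEAP


def batch_cost(tasks: List[Tuple[str, int]], tokens_per_task: int,
               always_strong: bool = False) -> float:
    """Total cost in USD for a batch, with or without routing."""
    total = 0.0
    for task, files_touched in tasks:
        tier = STRONG if always_strong else route(task, files_touched)
        total += tier.usd_per_1k_tokens * tokens_per_task / 1000
    return total


backlog = [
    ("rename a variable", 1),
    ("fix typo in README", 1),
    ("refactor auth module", 6),
    ("add a docstring", 1),
]

routed = batch_cost(backlog, tokens_per_task=2000)
unrouted = batch_cost(backlog, tokens_per_task=2000, always_strong=True)
print(f"routed: ${routed:.3f}  always-strong: ${unrouted:.3f}")
```

With these made-up prices, routing three of the four tasks to the cheap tier cuts the batch cost to roughly a third. The actual saving depends on real prices and on how often the cheap tier succeeds on the first pass.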
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development Strategies

This shift in understanding impacts how organizations should allocate resources in AI development. Since the harness and context management determine most of the system’s behavior and cost, focusing on configuration, tooling, and structured context can lead to more reliable, scalable, and cost-efficient AI systems. It also suggests that improvements in models alone will have diminishing returns compared to optimizing the surrounding infrastructure.

For CTOs and developers, this means reevaluating investment priorities, emphasizing system architecture, tooling, and verification, rather than solely chasing the latest model releases. It also underscores the importance of building internal expertise in context engineering and system configuration.

Background of AI System Design and the Model’s Role

Prior to this development, the common narrative in AI development centered on acquiring and deploying the most advanced models, with the assumption that model quality directly correlated with system performance. However, recent experiments and industry observations have shown that the same model, when integrated with different harnesses, can produce vastly different results. The whitepaper builds on this understanding, highlighting that the total system behavior depends heavily on how the model is embedded within the broader architecture.

This perspective aligns with ongoing industry trends toward modular, configurable AI systems, where the focus is on creating reusable, well-designed scaffolding around models to achieve desired outcomes efficiently.

“The model is only 10% of what determines system behavior; the harness is the other 90%.”

— Addy Osmani

Unresolved Questions About Implementation and Impact

While the whitepaper presents compelling evidence that harness and context are critical, it does not specify precise methodologies for optimal harness design or how organizations can best transition from model-centric to configuration-centric development. The long-term impact on AI innovation and competitive advantage remains to be seen, especially as models continue to evolve rapidly.

Additionally, it is unclear how these insights will influence industry standards and whether new tools will emerge to facilitate better harness and context engineering at scale.

Next Steps for AI Teams and Industry Adoption

Organizations should begin evaluating their current AI workflows, focusing on how they configure and manage context around models. Developing internal expertise in harness design and system architecture will be critical. Industry groups and tool vendors may also start releasing new frameworks and best practices to support this paradigm shift.

Further research and experimentation are expected to clarify optimal strategies for harness construction and context management, potentially leading to new standards and tools in AI development.

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper argues that the harness and context engineering — prompts, tools, rules, and observability layers — have a much larger influence on how the system performs than the model itself.

How should organizations change their AI development focus?

They should prioritize designing robust harnesses and managing context dynamically, rather than solely investing in acquiring the latest models.

What are the risks of focusing too much on models?

Overemphasizing models can lead to diminishing returns, higher costs, and less control over system behavior. Investing in system configuration and verification mitigates these risks.

Does this mean models are no longer important?

Models remain important, but their role is now seen as part of a larger system. The whitepaper emphasizes that the surrounding infrastructure determines most of the system’s effectiveness.

What is the main takeaway for AI practitioners?

The key lesson is to focus on system architecture, tooling, and context management to unlock the full potential of AI systems, rather than only chasing model improvements.

Source: ThorstenMeyerAI.com
