TL;DR
AI systems now automate most core engineering tasks in AI development, reaching near-saturation. However, AI’s ability to fully automate research processes remains uncertain, leaving a residual human role. This shift could significantly impact AI R&D workflows.
Recent evidence indicates that AI systems can now automate the majority of core engineering tasks involved in AI research and development, reaching near-saturation levels across multiple benchmarks. Meanwhile, the automation of AI research itself remains incomplete, with some aspects still reliant on human creativity and insight. This development marks a significant shift in the landscape of AI R&D, with potential implications for how research is conducted and organized.
According to Thorsten Meyer’s analysis of recent benchmarks, AI has achieved near-complete automation in core engineering skills relevant to AI development. For example, the CORE-Bench, which measures research reproduction capabilities, has seen performance improve from 21.5% in September 2024 to 95.5% by December 2025, with the benchmark’s author stating it is ‘solved.’ Similarly, the MLE-Bench, assessing Kaggle competition performance, rose from 16.9% in October 2024 to 64.4% in February 2026, approaching professional-level performance.
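The size and pace of these gains can be checked with simple arithmetic. The sketch below uses only the scores and months quoted above. The day of the month is set to 1 as a placeholder because the text gives no exact dates, and the names `BENCHMARKS`, `months_between`, and `summarize` are illustrative, not taken from any benchmark's tooling.

```python
from datetime import date

# Scores (percent) and months as quoted in the text.
# Day-of-month is a placeholder; the source gives only month and year.
BENCHMARKS = {
    "CORE-Bench": ((date(2024, 9, 1), 21.5), (date(2025, 12, 1), 95.5)),
    "MLE-Bench": ((date(2024, 10, 1), 16.9), (date(2026, 2, 1), 64.4)),
}


def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def summarize(name):
    """Return elapsed months, gain in percentage points, and average gain per month."""
    (start, first_score), (end, last_score) = BENCHMARKS[name]
    months = months_between(start, end)
    gain = last_score - first_score
    return {
        "months": months,
        "gain_pp": round(gain, 1),
        "pp_per_month": round(gain / months, 2),
    }


for name in BENCHMARKS:
    print(name, summarize(name))
```

An average per month is only a summary of two endpoints. It says nothing about whether progress was steady or came in jumps, and a score near 95% leaves little room for further gains on that benchmark.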
These benchmarks cover tasks like reproducing research papers, optimizing machine learning models, and designing GPU kernels. The pattern across these measures indicates that AI can automate large parts of the engineering work involved in AI R&D, reducing the marginal cost and friction traditionally associated with these activities. However, the same analysis suggests that research, defined here as creative, hypothesis-driven exploration, may be fundamentally different from the engineering work that has been automated.
Thorsten Meyer emphasizes that while engineering tasks are increasingly handled by AI, the residual research component—such as formulating new hypotheses, conceptual breakthroughs, and strategic innovation—may still require human input. The structural question remains whether research itself is becoming a form of engineering at scale, which could accelerate automation beyond current expectations.
Engineering is automated.
Research is the residual.
Six skill benchmarks. Edison’s framing. The question Clark leaves open is whether research is just engineering at scale.
Jack Clark’s Import AI #455 catalogs six benchmarks measuring AI capability on AI R&D tasks and concludes “AI can today automate vast swatches, perhaps the entirety, of AI engineering.” The residual question is research. The structural read on the residual: it may not be a permanent moat.
Six skills. One trajectory.
Clark catalogs six benchmarks measuring AI capability on AI R&D-relevant tasks. Each individual benchmark could be noise. Six benchmarks moving together is a curve. The pattern is the cascade observed across the broader Clark series — visible here in the specific R&D-skill domain.
Three data points. Mixed signal.
Clark provides three data points on the creative-spark question. Yes-evidence: Erdős-1051, centaur math discovery, sporadic Move-37-style moments. No-evidence: low yield, framing dependence, absence of acceleration. The mixed signal is the honest read.
The data supports two readings. Pessimistic: rare moments suggest creative insight is qualitatively distinct from engineering work. Optimistic: rare moments are an artifact of low-volume exploration; more shots on goal yields more discoveries. Both readings are consistent with Clark’s “vast swatches, perhaps the entirety” claim. They differ on the residual.
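The optimistic "more shots on goal" reading can be made concrete with a toy model. The sketch below assumes each attempt is independent and has the same small chance of producing a discovery. Both assumptions, and the value `p = 0.001`, are hypothetical and do not come from Clark's data.

```python
def p_at_least_one(p_per_attempt, attempts):
    """Probability of one or more successes across independent attempts."""
    return 1.0 - (1.0 - p_per_attempt) ** attempts


def expected_discoveries(p_per_attempt, attempts):
    """Expected number of successes across independent attempts."""
    return p_per_attempt * attempts


# Hypothetical per-attempt success probability, chosen only for illustration.
p = 0.001

for n in (100, 1_000, 10_000):
    print(
        f"attempts={n:>6}  "
        f"P(at least one)={p_at_least_one(p, n):.3f}  "
        f"expected={expected_discoveries(p, n):.1f}"
    )
```

In this model, rare successes become routine once volume is high enough, which is the optimistic reading. The pessimistic reading corresponds to a per-attempt probability that is effectively zero for some kinds of insight, in which case no number of attempts helps. The current data does not say which case applies.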

Five dimensions Clark gestures at but leaves underdeveloped.
Clark’s section is rigorous on the empirical evidence. Five strategic dimensions matter for the institutional response that the Clark series synthesis argues is structurally inadequate.
Two readings. Different equilibria.
The structural question Clark leaves open: is research a permanent moat that bounds automated AI R&D, or is it engineering at scale that dissolves with more shots on goal? Both readings are consistent with the current data. They differ by orders of magnitude in consequences.
Labels from the accompanying comparison: "Productivity multiplier years" and "Recursive loop operational."
Five audiences. Asymmetric cost of being wrong.
The institutional response should not bet on inspiration being a permanent moat. If the distinction holds, capacity built is still useful. If it closes, capacity is necessary. Asymmetric cost-of-being-wrong points toward building now.
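The asymmetry argument above can be written as a small decision table. The sketch below picks the action with the smallest worst-case regret. Every cost is a hypothetical placeholder in arbitrary units, chosen only to encode the stated asymmetry (built capacity is still useful if the moat holds, and necessary if it closes); none of the numbers come from the source.

```python
# Hypothetical costs in arbitrary units; illustrative only, not from the source.
COSTS = {
    ("build", "moat_holds"): 1,    # capacity was not strictly needed but is still useful
    ("build", "moat_closes"): 0,   # capacity was necessary and is in place
    ("wait", "moat_holds"): 0,     # nothing was needed and nothing was spent
    ("wait", "moat_closes"): 10,   # capacity was necessary and is missing
}
ACTIONS = ("build", "wait")
STATES = ("moat_holds", "moat_closes")


def regret(action, state):
    """Extra cost of an action compared with the best action for that state."""
    best = min(COSTS[(a, state)] for a in ACTIONS)
    return COSTS[(action, state)] - best


def minimax_regret_action():
    """Return the action with the smallest worst-case regret, plus all worst cases."""
    worst = {a: max(regret(a, s) for s in STATES) for a in ACTIONS}
    return min(worst, key=worst.get), worst


choice, worst_case = minimax_regret_action()
print(choice, worst_case)
```

The result depends entirely on the assumed asymmetry. If building were costlier than being caught unprepared, the same procedure would favor waiting, so the substantive claim is about the relative size of the two costs.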
The five audiences: industry, academia, policymakers, investors, and everyone else.
Engineering is automated. The residual is the question. The institutional response should not bet on inspiration being a permanent moat.
Implications of Engineering Automation for AI R&D
The near-complete automation of core engineering tasks in AI development suggests a paradigm shift in how AI research is conducted. Organizations may see reduced costs, faster iteration cycles, and increased scalability of AI projects. However, the remaining human role in research—particularly in creative and strategic aspects—raises questions about the future division of labor and the potential for AI to fully automate the entire research process. This could lead to a restructuring of AI R&D workflows and organizational models, with significant impacts on employment, innovation pace, and scientific discovery.
Recent Advances in AI Engineering Capabilities
Over the past two years, multiple benchmarks and research initiatives have demonstrated rapid progress in AI’s engineering skills. CORE-Bench, which measures research reproduction, improved from 21.5% to 95.5% in 15 months (September 2024 to December 2025); MLE-Bench, which assesses Kaggle competition performance, rose from 16.9% to 64.4% in 16 months (October 2024 to February 2026). Concurrently, research papers have detailed advances in kernel design, code optimization, and infrastructure automation, indicating that AI is moving from experimental to production-grade engineering.
This pattern of rapid progress across diverse technical domains suggests that AI is approaching a point where engineering tasks are effectively automated, shifting the bottleneck toward research innovation. The development aligns with broader theories that AI’s capabilities are reaching a ‘coding singularity,’ where the engineering component becomes largely self-sufficient.
“The pattern across multiple benchmarks indicates AI can automate vast swaths of engineering work, perhaps the entirety, leaving research as the residual human domain.”
— Thorsten Meyer
Unresolved Questions About AI-Driven Research
It remains unclear whether AI can fully automate the creative and hypothesis-generating aspects of research. While engineering tasks are approaching automation saturation, the extent to which AI can replace human intuition, strategic insight, and conceptual innovation is still under debate. Additionally, the impact of this automation on scientific progress, organizational structures, and employment in research roles is not yet fully understood.
Next Steps for AI R&D and Organizational Adaptation
Researchers and organizations are likely to focus on developing benchmarks and tools to measure the full scope of research automation. Monitoring how AI handles strategic, hypothesis-driven tasks will be critical. Policy discussions around workforce implications and organizational restructuring are expected to intensify as AI approaches human-level capabilities in research activities. Further technological advances over the next 32 months will clarify whether research automation can match engineering automation.
Key Questions
What specific engineering tasks has AI automated?
AI has automated tasks such as reproducing research papers, optimizing machine learning models, designing GPU kernels, and automating infrastructure. Benchmark scores are high but uneven: 95.5% on CORE-Bench and 64.4% on MLE-Bench.
Does this mean AI can now do all research work?
No. While engineering tasks are approaching full automation, the creative, hypothesis-driven aspects of research are less settled and likely still rely on human insight.
What are the implications for AI research organizations?
Organizations may experience reduced costs and faster development cycles, but will need to address how to integrate human creativity with automated engineering workflows.
Will AI replace human researchers entirely?
It is currently unclear whether AI can fully replace human researchers, especially in generating novel ideas and setting strategic direction, which appear harder to automate.
What should organizations do to prepare for this shift?
They should invest in developing tools to measure research automation, adapt workflows to incorporate AI-driven engineering, and consider workforce implications.
Source: ThorstenMeyerAI.com