// news · research-papers · agents · 2026-05-23 · source: deepmind / arxiv / hackernews

DeepMind publishes Amplifier — recursive self-improvement architecture for code generation, 67% gain over baseline on SWE-bench Verified

Google DeepMind published the Amplifier paper this week, describing a recursive self-improvement architecture for code-generation agents that achieves a 67% gain over a Gemini 2.5 Pro baseline on SWE-bench Verified. The architecture combines an actor model, a critic model, and a learned reward model that updates from execution feedback at training time.

The 67% gain is the largest single architecture-level result on SWE-bench Verified since the original AutoCodeRover and SWE-agent papers in 2024. In its largest configuration, Amplifier scores 0.81 on SWE-bench Verified, above every reported closed-flagship baseline, including Claude 4.7 Opus's 0.69 and GPT-5's 0.71.
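The figures above give the 67% gain and the 0.81 final score, but not the baseline score itself. As a quick sanity check, the sketch below computes the baseline that each reading of "67% gain" would imply. The function names are ours, and the source confirms neither reading.

```python
def implied_baseline_relative(final_score: float, gain: float) -> float:
    """Baseline if the gain is relative: final = baseline * (1 + gain)."""
    return final_score / (1.0 + gain)


def implied_baseline_absolute(final_score: float, gain_points: float) -> float:
    """Baseline if the gain is in absolute points: final = baseline + gain."""
    return final_score - gain_points


final_score = 0.81  # Amplifier's reported SWE-bench Verified score
gain = 0.67         # the reported 67% gain

relative = implied_baseline_relative(final_score, gain)
absolute = implied_baseline_absolute(final_score, gain)
print(f"relative reading: baseline ~ {relative:.3f}")
print(f"absolute reading: baseline ~ {absolute:.3f}")
```

Under the relative reading the baseline would sit near 0.49; under the absolute (percentage-point) reading it would sit near 0.14. Which one the paper intends should be checked against the paper itself.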

The methodological contribution is the execution-feedback reward model. The critic model takes the actor's generated patch, runs it against the test suite in an isolated sandbox, and updates the reward model from the resulting trace. The update happens at training time on millions of synthetic-bug instances, so the architecture learns to predict which patches will pass before they are tried, which is what drives the 67% gain.
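The loop described above can be illustrated with a toy sketch. This is not DeepMind's implementation: every name here (`RewardModel`, `actor_propose`, `sandbox_run`, `train`, `pick_best`) is hypothetical, the actor and the sandbox are simulated, and a small logistic regression stands in for the learned reward model. It only shows the shape of the idea: execute a patch, record pass or fail, update a predictor, then use the predictor to rank patches before running them.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class RewardModel:
    """Toy logistic model predicting P(patch passes tests) from features."""
    weights: list[float]
    bias: float = 0.0
    lr: float = 0.1

    def predict(self, features: list[float]) -> float:
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features: list[float], passed: bool) -> None:
        # One SGD step on log loss; the gradient w.r.t. z is (p - y).
        error = self.predict(features) - (1.0 if passed else 0.0)
        self.weights = [w - self.lr * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.lr * error


def actor_propose(rng: random.Random) -> list[float]:
    """Hypothetical actor: a feature vector standing in for a generated patch."""
    return [rng.uniform(-1, 1), rng.uniform(-1, 1)]


def sandbox_run(features: list[float]) -> bool:
    """Stand-in for running the test suite; passes when a hidden rule holds."""
    return features[0] + 0.5 * features[1] > 0


def train(steps: int = 5000, seed: int = 0) -> RewardModel:
    rng = random.Random(seed)
    model = RewardModel(weights=[0.0, 0.0])
    for _ in range(steps):
        patch = actor_propose(rng)
        passed = sandbox_run(patch)   # critic step: execute the patch
        model.update(patch, passed)   # reward model learns from the outcome
    return model


def pick_best(model: RewardModel, candidates: list[list[float]]) -> list[float]:
    """Rank candidates by predicted pass probability without executing them."""
    return max(candidates, key=model.predict)
```

After `model = train()`, calling `pick_best(model, candidates)` selects the candidate the reward model expects to pass, so only the most promising patches need to reach the sandbox.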

See our analysis →

DeepMind — Amplifier release blog → · arXiv — Amplifier: Recursive Self-Improvement for Code Generation → · HackerNews — DeepMind Amplifier discussion thread →