// news · research-papers · agents · 2026-05-25 · source: openai / techjacks / phys.org

Four AI math milestones in 30 days plus LIMO reasoning efficiency — the research-automation thesis has empirical anchors now, not just speculation

Between April 21 and May 25, 2026, the thesis that AI can automate mathematics research accumulated five concrete empirical anchors: AlphaEvolve's production records, FrontierMath Tier 4, WorldReasonBench, OpenAI's Erdős disproof with Tim Gowers's companion paper, and now the LIMO efficiency reproduction. The cumulative case for AI as research infrastructure is no longer speculative.

The cumulative count of five milestones is the strongest version of the argument. Through 2024, "AI can do mathematics" was a press-release claim met with skepticism from working mathematicians. April and May 2026 produced: (1) AlphaEvolve setting novel construction records on CDCL solver kernels and matrix multiplication tilings, (2) Google DeepMind's FrontierMath Tier 4 solve at graduate-research level, (3) WorldReasonBench, a video-reasoning benchmark from a non-Google team, (4) OpenAI's disproof of the 80-year-old Erdős unit-distance conjecture, with Fields medalist Tim Gowers writing the endorsing companion paper, and (5) a reproduction of LIMO's reasoning-efficiency result at frontier scale.

Five results across multiple labs in 35 days is a phase transition, not a coincidence. (Two of the five, AlphaEvolve and the FrontierMath Tier 4 solve, come from Google DeepMind, so the count of distinct labs is smaller than the count of results.) The relevant question is no longer whether AI can do research-grade mathematics; it is how fast the trajectory accelerates from here. Jack Clark's Cosmos Lecture put a 60%+ probability on recursive self-improvement by end-2028; the math milestones are exactly the data points that probability is conditioned on. If the same pace, roughly a milestone per week, continues for the next 12 months and extends to other research surfaces (chemistry, biology, materials science), the research-automation thesis moves from "AI is becoming research infrastructure" to "AI is the dominant research engine across multiple domains." The next twelve months will tell us whether that pace is sustainable.


Tech Jacks Solutions — AI Math Reasoning Milestones 30 Days → · Phys.org — AI breakthrough in math problem decades → · arXiv — LIMO Less is More for Reasoning →