// news · research-papers · frontier-models · 2026-05-25 · source: arxiv / claude5 / matsprogram

"Random Scaling of Emergent Capabilities" paper reframes emergence as probabilistic: capabilities don't appear at fixed thresholds but with a probability that scales with model size

A recent arXiv paper argues that the "emergent capability" phenomenon, in which capabilities appear abruptly at certain model scales, is better described as probabilistic emergence: a capability can be learned at a range of scales, with a probability that changes with scale, rather than at one fixed threshold. The reframe has implications for how labs predict and plan for capability jumps in future model generations.

The threshold-vs-probability distinction matters operationally. The threshold framing (which dominated 2022-2024 emergence research) predicts that scaling a model past a specific parameter count or compute budget will reliably produce a new capability. The probability framing predicts that scaling increases the likelihood of the capability emerging, but specific training runs at the same scale may or may not produce it. The two predictions diverge most clearly at the frontier where labs are spending billions of dollars on training runs and need to forecast capability outcomes.
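The contrast between the two framings can be sketched in a few lines of Python. This is a toy illustration, not the paper's model: the logistic curve in log-scale, and the names `midpoint` and `steepness`, are assumptions introduced here only to show how identical-scale runs can disagree under the probability framing.

```python
import math
import random


def threshold_emerges(scale: float, threshold: float) -> bool:
    """Threshold framing: the capability is present exactly when scale >= threshold."""
    return scale >= threshold


def emergence_probability(scale: float, midpoint: float, steepness: float) -> float:
    """Probability framing, with an ASSUMED logistic curve in log10(scale).

    midpoint:  hypothetical scale at which a run has a 50% chance of showing the capability
    steepness: hypothetical slope; larger values approach the threshold framing
    """
    z = steepness * (math.log10(scale) - math.log10(midpoint))
    return 1.0 / (1.0 + math.exp(-z))


def simulate_runs(scale, midpoint, steepness, n_runs, seed=0):
    """Draw n_runs independent training runs at one scale; True means the capability emerged."""
    rng = random.Random(seed)
    p = emergence_probability(scale, midpoint, steepness)
    return [rng.random() < p for _ in range(n_runs)]


if __name__ == "__main__":
    midpoint = 1e10  # hypothetical parameter count, chosen only for illustration
    for scale in (1e9, 1e10, 1e11):
        runs = simulate_runs(scale, midpoint, steepness=2.0, n_runs=10, seed=42)
        print(
            f"scale={scale:.0e}  threshold says {threshold_emerges(scale, midpoint)}  "
            f"p={emergence_probability(scale, midpoint, 2.0):.2f}  "
            f"emerged in {sum(runs)}/10 simulated runs"
        )
```

Under the threshold function every run at a given scale gets the same answer; under the assumed probability curve, runs at the same scale can split, which is the divergence described above.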

The implication for the recursive-self-improvement (RSI) timeline is the part with strategic weight (Jack Clark's Cosmos Lecture put it at 60%+ probability by end-2028, covered in earlier cycles). Threshold-based emergence forecasting predicts that RSI capability appears reliably above a specific scale, so the timeline is deterministic once that scale is hit. Probability-based emergence forecasting predicts that RSI capability appears with a probability that increases with scale, so timelines are stochastic and labs may need multiple training runs at the same scale before achieving the capability. The latter framing makes the alignment-research clock harder to set, because we don't know exactly when the capability will appear, only that it becomes more likely as scale grows. For RSP-style if-then triggers, that uncertainty is difficult to operationalize.
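The "multiple training runs at the same scale" point has simple arithmetic behind it if one assumes, purely for illustration, that each run at a given scale is an independent trial with the same success probability `p`. That independence assumption and the function names below are not from the paper; real runs share data, architecture, and hyperparameters and are unlikely to be fully independent.

```python
import math


def _check_probability(value: float) -> None:
    """Reject values outside the open interval (0, 1)."""
    if not 0.0 < value < 1.0:
        raise ValueError("value must be strictly between 0 and 1")


def prob_at_least_one(p: float, n_runs: int) -> float:
    """Chance that at least one of n_runs independent runs shows the capability."""
    _check_probability(p)
    return 1.0 - (1.0 - p) ** n_runs


def expected_runs(p: float) -> float:
    """Mean number of runs until the first success (geometric distribution)."""
    _check_probability(p)
    return 1.0 / p


def runs_needed(p: float, target: float) -> int:
    """Smallest number of runs whose at-least-one-success probability reaches target."""
    _check_probability(p)
    _check_probability(target)
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

With a hypothetical per-run probability of 0.3, `runs_needed(0.3, 0.9)` returns 7: seven same-scale runs would be needed for a 90% chance of seeing the capability at least once. A threshold model has no analogous quantity, which is part of why a single if-then trigger tied to scale is harder to define under the probability framing.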


arXiv: Random Scaling of Emergent Capabilities · arXiv: Emergent Abilities in Large Language Models: A Survey · arXiv: Predicting Emergent Capabilities by Finetuning