Emergence is probabilistic, not threshold-based — and the implications for RSI timelines
A recent paper reframes "emergent capability" as probabilistic emergence: capabilities don't appear at fixed thresholds; they appear with a probability that scales with model size. The reframe sounds methodological, but it has direct consequences for how we time the recursive self-improvement (RSI) window and how we operationalize if-then triggers in the style of a Responsible Scaling Policy (RSP).
The "emergent capability" phenomenon (capabilities appearing abruptly at certain model scales rather than smoothly with scale) was one of the most-discussed empirical findings of 2022-2024 AI research. The standard framing was threshold-based: models below a particular parameter count don't have the capability, and models above it do. Under that framing, you forecast capability arrivals by predicting which model generation will cross which threshold.
A recent arXiv paper argues the threshold framing is wrong. The empirical evidence is better described as probabilistic emergence: models learn capabilities at various scales with changing probability rather than at a fixed threshold. Two models of the same architecture trained at the same scale may produce different capability outcomes — one might have the capability, the other might not, with the probability of "have" increasing with scale.
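To make the contrast with a fixed threshold concrete, here is a minimal simulation sketch. It assumes, purely for illustration, that the probability of a capability emerging follows a logistic curve in log10(parameter count); the curve shape, the `midpoint` and `steepness` values, and the function names are hypothetical and are not taken from the paper.

```python
import math
import random


def emergence_probability(log10_params, midpoint=10.0, steepness=2.0):
    """Hypothetical logistic curve for P(capability | scale).

    midpoint and steepness are illustrative assumptions, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (log10_params - midpoint)))


def simulate_runs(log10_params, n_seeds, rng):
    """Return one boolean per training seed: did the capability emerge?"""
    p = emergence_probability(log10_params)
    return [rng.random() < p for _ in range(n_seeds)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    for scale in (9.0, 9.5, 10.0, 10.5, 11.0):
        outcomes = simulate_runs(scale, n_seeds=20, rng=rng)
        print(
            f"10^{scale:.1f} params: P={emergence_probability(scale):.2f}, "
            f"{sum(outcomes)}/20 seeds have the capability"
        )
```

Under a threshold model every seed at a given scale would agree. Here, seeds at the same scale can disagree, and only the fraction that "have" the capability rises with scale.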
Why this is more than a methodology nit
Threshold-based emergence forecasting has been the implicit foundation of most AI capability prediction. Predictions of the form "GPT-5 will reach capability X because it crosses parameter count Y" assume a deterministic relationship between scale and capability. Probabilistic emergence breaks that assumption: GPT-5 might reach capability X with a probability P that depends on scale, but the outcome of any specific training run is uncertain.
The operational consequence: labs cannot reliably predict whether their next model generation will have a specific capability. They can predict the probability, but the actual outcome of any specific training run is stochastic. That changes how labs plan investments, how regulators set evaluation thresholds, and how alignment researchers time their methodology development.
The RSI timeline implication
Jack Clark's Cosmos Lecture in May 2026 put a 60%+ probability on recursive self-improvement (RSI) capability appearing by end-2028. The threshold-based interpretation of that statement: at some specific scale or compute budget reached by end-2028, RSI capability deterministically appears. The probabilistic interpretation: the probability of RSI capability appearing crosses 60% by end-2028, but the actual appearance event is stochastic — it could happen earlier with bad luck, later with good luck.
For RSP-style if-then triggers, the difference matters operationally. A threshold-based RSP says "if capability X is observed, trigger response Y." That implicitly assumes capability X either is or isn't observed — a clean binary. A probabilistic-emergence RSP needs a different structure: "if probability of capability X exceeds threshold Z based on independent capability evaluations, trigger response Y" — which requires the evaluations to be calibrated and the threshold to be defensible. That's a meaningfully harder framework to operationalize.
The methodology change the field needs
Three operational changes follow if probabilistic emergence is correct. First: capability evaluations need to be probabilistic, not binary. Instead of "does this model have capability X," the eval becomes "what is the probability that capability X manifests across the deployment surface." That requires running the eval many times across diverse conditions, not once.
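One way to turn repeated eval runs into a probability estimate is a binomial confidence interval. The sketch below uses the Wilson score interval, which is a standard choice but only one option among several. It assumes each eval run is an independent trial with a binary outcome, which real evals across "diverse conditions" may not satisfy; the function name is hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of eval runs in which the capability manifested
    trials:    total number of eval runs (assumed independent)
    z:         normal quantile; 1.96 gives roughly a 95% two-sided interval
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
    ) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Example: the capability showed up in 3 of 200 eval runs.
    low, high = wilson_interval(3, 200)
    print(f"estimated P(capability manifests) is between {low:.4f} and {high:.4f}")
```

The interval makes the cost of a single-shot eval visible: with one run the interval is almost uninformative, and it only narrows as the number of runs grows.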
Second: RSP triggers need probability-based thresholds, not capability-based ones. For example: "If P(catastrophic-capability) > 5%, halt deployment." That is how the actuarial fields handle low-probability, high-consequence events, and the alignment field needs to adopt the same framing.
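A probability-based trigger of this kind could be sketched as follows. This is an illustration under stated assumptions, not a description of any lab's actual policy: eval runs are treated as independent binary trials, the trigger fires when a one-sided exact (Clopper-Pearson) upper bound on the capability probability exceeds the threshold, and the function names and the `alpha` default are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_bound(k, n, alpha=0.05):
    """One-sided Clopper-Pearson upper bound on p after k hits in n runs.

    Found by bisection: the bound is the p at which P(X <= k | p) = alpha.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    if k == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # 60 halvings is far below float precision
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > alpha:
            lo = mid  # p = mid is still plausible, so the bound is higher
        else:
            hi = mid
    return hi


def should_halt(k, n, threshold=0.05, alpha=0.05):
    """Trigger if the data cannot rule out P(capability) > threshold."""
    return upper_bound(k, n, alpha) > threshold


if __name__ == "__main__":
    for hits, runs in ((0, 20), (0, 100), (2, 100)):
        print(hits, runs, round(upper_bound(hits, runs), 4), should_halt(hits, runs))
```

The design choice worth noting is that the trigger is conservative: zero observed hits in a small number of runs still fires it, because a handful of clean runs cannot rule out a probability above the threshold.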
Third: red-team methodology needs to account for probabilistic emergence. Autonomous red-team agents that solve the majority of black-box challenges are useful precisely because they can run thousands of attack iterations and surface low-probability failure modes that human red teams would miss. Combined with probabilistic-emergence framing, this is the methodology that catches the rare-but-dangerous capability emergence events that threshold-based methodology assumes can't happen until they suddenly do.
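The arithmetic behind "thousands of attack iterations" can be sketched directly. If a failure mode appears with probability p on each attempt, and attempts are independent (a simplifying assumption; real attack iterations are often correlated), the chance of seeing it at least once in n attempts is 1 - (1 - p)^n. The function names below are hypothetical.

```python
import math


def detection_probability(p, n):
    """Chance of observing a failure at least once in n independent attempts."""
    return 1.0 - (1.0 - p) ** n


def attempts_needed(p, confidence=0.95):
    """Smallest n such that detection_probability(p, n) >= confidence."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.001  # illustrative: a failure mode that shows up once per 1,000 attempts
    for n in (50, 1000, attempts_needed(p)):
        print(f"n={n}: P(detect at least once) = {detection_probability(p, n):.3f}")
```

A human red team running a few dozen attempts has only a small chance of ever seeing a one-in-a-thousand failure, while an automated agent running thousands of attempts is likely to surface it.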
The deeper concern
The combination of probabilistic emergence, evaluation-aware models, and a doubling of risk under pressure conditions describes a methodology landscape that is structurally harder than the field has been treating it. Each finding is manageable on its own. Together they imply that the standard train-eval-deploy-monitor pipeline needs significant rework before the next generation of frontier models can ship safely. Whether the field's methodology keeps pace with capability growth is the question the next 12 months will answer.
arXiv: "Random Scaling of Emergent Capabilities" · arXiv: "Emergent Abilities in Large Language Models: A Survey" · Help Net Security: "AI red teaming agents change how LLMs get tested"