// news · alignment · biosecurity · 2026-05-20 · source: ai safety report

International AI Safety Report: OpenAI o3 outperforms 94% of domain experts on virology lab protocols

The International AI Safety Report 2026 cites OpenAI's o3 as outperforming 94% of domain experts at troubleshooting virology lab protocols. That capability now exists in deployed frontier models, and it is the specific basis for the biosecurity risk-amplifier concern driving CAISI's pre-deployment testing regime.

The 94% number is the policy-shaping figure. It is precise enough to argue with, high enough to be alarming, and tied to a concrete capability (lab protocol troubleshooting) that anyone can recognize as dual-use. Expect this single statistic to be cited in every AI biosecurity hearing through 2026.

The mitigation question is unsolved. Downstream content filters work poorly here because the capability is not "knows dangerous facts" but "can reason about lab setups", and the same reasoning serves legitimate research. The joint evaluation work that Anthropic and OpenAI carried out over the summer is the first public attempt to converge on benchmark methodology rather than on mitigation methods.
