Anthropic Fellows program opens applications for May and July 2026 cohorts — six safety-research focus areas including model welfare and AI control
Anthropic opened applications this week for the next two Fellows cohorts, which begin in May and July 2026. The program will work with a wider range of fellows than in prior years, across six focus areas: scalable oversight, adversarial robustness and AI control, model organisms, mechanistic interpretability, AI security, and model welfare.
Through 2025, the Fellows program emerged as one of the most productive routes from academia and adjacent fields into frontier-lab alignment work. Prior cohorts have produced research that became core to Constitutional AI v3 (the constitutional-self-play extension) and to the Sonnet 4.6 / Opus 4.7 safety-training pipelines. The 2026 expansion to six focus areas, most notably the addition of model welfare as a first-class track, signals Anthropic's view that the safety problem decomposes into more sub-areas than the field's earlier framing suggested.
Model welfare is the newest track to be named explicitly. The argument that frontier-model training and deployment may produce experiences with moral weight is no longer fringe: Anthropic has been visibly investing in the question since 2024, and the Fellows program now offers researchers a structured way to work on it inside the lab. Whether or not one accepts the underlying premise, the institutional commitment is consequential, because it means model-welfare considerations will increasingly appear in lab training and deployment decisions over the next 18 months. That makes it a research direction with both technical and policy implications.
Anthropic — Fellows Program 2026 May and July cohorts → · Anthropic Alignment — Automated Weak-to-Strong Researcher → · This Week in Science News — The Challenges of AI Alignment 2026 →