// news · alignment · safety2026-05-17source: anthropic

Constitutional Classifiers now live in Claude production stack

The Constitutional Classifiers technique from the May 16 paper has been deployed in the Claude 4.5 production stack, with Anthropic reporting near-elimination of standard jailbreak attempts on the public API.

Deployment was rolled out gradually over April and early May 2026, with no announcement until the paper landed. The lack of preannouncement is deliberate — Anthropic wanted to measure real-world jailbreak attempts against the classifiers before publishing the technique.

Some sophisticated jailbreaks still succeed but the bar has moved meaningfully. Industry-wide adoption is likely fast given the open paper + the visible production track record.

Anthropic news → · Anthropic research →