// news · models · UX · 2026-05-11 · source: thinkingmachines.ai

Mira Murati's Thinking Machines unveils "interaction models" — 0.4-second full-duplex AI

Former OpenAI CTO's startup announces TML-Interaction-Small: a model designed to handle voice, video, and text simultaneously, respond in 0.40 seconds, and interrupt mid-sentence rather than waiting for turns.

Thinking Machines Lab announced its first public technology on May 11. The company describes it as "interaction models" rather than "language models," and the framing is more than rebranding. The flagship system, TML-Interaction-Small, handles audio, video, and text concurrently and responds with roughly 0.40 seconds of latency. Per the company's demos, it listens while it talks and can jump into a conversation the way a person would.

This is a research preview; a limited release is planned for partners in the coming months, with a wider public release later this year. The company is reportedly in talks for funding at around a $50 billion valuation.

The technical bet: turn-taking is the wrong abstraction for human-AI interaction. If true, it forces a different architecture all the way down — input encoders, generation, and the inference runtime have to handle continuous concurrent signal rather than discrete prompt/response cycles.
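To make the contrast with turn-taking concrete, here is a toy sketch of a full-duplex loop in Python. It is not Thinking Machines' architecture, which the announcement does not detail. It only illustrates the general idea of listening and speaking as concurrent tasks, with the speaker yielding when input arrives. All names (`speak`, `listen`, `full_duplex_demo`) and the timings are hypothetical, and strings stand in for audio.

```python
import asyncio

async def speak(chunks, interrupted, spoken):
    """Emit response chunks one at a time, stopping early on barge-in."""
    for chunk in chunks:
        if interrupted.is_set():
            return  # the user started talking: yield the floor
        spoken.append(chunk)
        await asyncio.sleep(0.01)  # stand-in for time spent playing audio

async def listen(incoming, interrupted, heard):
    """Consume user input while the speaker task is still running."""
    while True:
        item = await incoming.get()
        if item is None:  # sentinel: input stream closed
            return
        heard.append(item)
        interrupted.set()  # tell the speaker to stop

async def full_duplex_demo():
    incoming = asyncio.Queue()
    interrupted = asyncio.Event()
    spoken, heard = [], []

    async def user():
        # Hypothetical user who cuts in partway through the response.
        await asyncio.sleep(0.025)
        await incoming.put("wait, actually")
        await incoming.put(None)

    # Speaking and listening run concurrently, not in alternating turns.
    await asyncio.gather(
        speak(["The", "answer", "is", "forty", "two"], interrupted, spoken),
        listen(incoming, interrupted, heard),
        user(),
    )
    return spoken, heard
```

Running `asyncio.run(full_duplex_demo())` returns the chunks spoken before the interruption and the input heard during it. In a turn-based loop, by contrast, the whole response would be emitted before any new input was read.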

Sources: Thinking Machines Lab · Semafor coverage