Mira Murati says every AI lab built interaction wrong

Today's signal

Thinking Machines Lab, the AI startup founded by former OpenAI CTO Mira Murati, today released a research preview of what it calls Interaction Models: a new class of AI built from scratch for real-time, multimodal collaboration. Unlike other major AI models today, these models do not wait for you to finish speaking before they start processing. They listen, watch, and respond simultaneously across audio, video, and text, with a reported turn-taking latency of 0.40 seconds.

Why it matters

Every frontier lab, including OpenAI, Anthropic, and Google, has built its interaction layer as an afterthought: a harness bolted on top of the underlying model. Thinking Machines built the interaction capability into the model architecture itself, which means that as the model scales, so does its ability to collaborate in real time. The research paper published alongside the release directly cites an Anthropic model card to make its case: Anthropic's own documentation acknowledges that its model underperforms when used in a synchronous, hands-on-keyboard pattern. That is not a minor footnote; it is Murati using the industry's own admissions as evidence. The TML-Interaction-Small model, a 276-billion-parameter Mixture-of-Experts architecture with 12 billion active parameters, posts a turn-taking latency of 0.40 seconds, against 0.57 seconds for Gemini-3.1-flash-live and 1.18 seconds for GPT-realtime-2.0. On FD-bench V1.5, the interaction quality benchmark, it scored 77.8 versus 46.8 for GPT-realtime-2.0 minimal.
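To put those figures side by side, here is a minimal Python sketch that computes the ratios from the numbers quoted above. It uses only the values reported in this piece, which have not been independently verified; the variable names are illustrative and not taken from the Thinking Machines paper.

```python
# Figures as quoted in this article (not independently verified).
latency_s = {
    "TML-Interaction-Small": 0.40,
    "Gemini-3.1-flash-live": 0.57,
    "GPT-realtime-2.0": 1.18,
}
fd_bench_score = {
    "TML-Interaction-Small": 77.8,
    "GPT-realtime-2.0": 46.8,
}

baseline = "TML-Interaction-Small"

# How many times slower each rival is at turn-taking than the baseline.
latency_ratios = {
    name: value / latency_s[baseline]
    for name, value in latency_s.items()
    if name != baseline
}

# How many times higher the baseline scores on the interaction benchmark.
score_ratio = fd_bench_score[baseline] / fd_bench_score["GPT-realtime-2.0"]

for name, r in latency_ratios.items():
    print(f"{name}: {r:.2f}x the turn-taking latency of {baseline}")
print(f"{baseline}: {score_ratio:.2f}x the FD-bench score of GPT-realtime-2.0")
```

On these numbers the latency gap to GPT-realtime-2.0 is just under 3x, while the FD-bench score gap is closer to 1.7x.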

The take

The entire AI industry has been optimizing for autonomy: the model does the work while you walk away. Murati is making a different bet: that the most valuable AI is the one that keeps you in the loop, not the one that replaces you. Whether or not the benchmarks hold up under real enterprise conditions, the architecture argument is sound. You cannot bolt on real-time collaboration after the fact. Every lab that has tried has ended up with latency workarounds and scaffolding hacks. If Thinking Machines is right about this, then every major model in production today is built on a structural flaw.

The number

1.7x. On FD-bench V1.5, TML-Interaction-Small scored 77.8 on interaction quality, roughly 1.7 times the 46.8 posted by GPT-realtime-2.0 minimal. This is the first time a startup's model has outperformed both OpenAI and Google on a real-time interaction benchmark.

Read the full breakdown → Analytics Drift
