OpenAI ships GPT-Realtime-2 with GPT-5-class reasoning in the voice API
OpenAI released a new family of realtime voice models, including GPT-Realtime-2, with reasoning, tool use, interruption handling, and context windows of up to 128K tokens. The models top Big Bench Audio and Conversational Dynamics scores and ship alongside updates to GPT-Translate and GPT-Whisper. The latency improvements come from a rebuilt WebRTC stack that OpenAI also documented this week.
Voice agents that can reason within a single turn change how contact center and field-service deployments are designed. Teams still building three-stage cascaded pipelines (ASR plus LLM plus TTS) should re-baseline cost and latency before committing to another year of glue code.
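
One way to frame that re-baselining is a per-turn comparison between the cascaded pipeline and a single speech-to-speech model. The Python sketch below is illustrative only: the `Stage` type, the stage names, and every number in it are hypothetical placeholders rather than measurements or published prices, and it assumes the stages run strictly one after another with no streaming overlap, which overstates latency for pipelines that stream between stages.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One step in a voice pipeline, with per-turn figures you measured yourself."""
    name: str
    latency_ms: float     # median latency this stage adds to a turn
    cost_per_turn: float  # cost this stage adds to a turn, in USD


def baseline(stages):
    """Sum latency and cost across stages, assuming they run sequentially."""
    total_latency = sum(s.latency_ms for s in stages)
    total_cost = sum(s.cost_per_turn for s in stages)
    return total_latency, total_cost


def compare(cascaded, single):
    """Return single-model minus cascaded deltas; negative means the single model is lower."""
    c_latency, c_cost = baseline(cascaded)
    s_latency, s_cost = baseline(single)
    return {
        "latency_delta_ms": s_latency - c_latency,
        "cost_delta_usd": s_cost - c_cost,
    }


# PLACEHOLDER values only: replace with numbers from your own traffic and invoices.
cascaded = [
    Stage("asr", 300.0, 0.002),
    Stage("llm", 700.0, 0.010),
    Stage("tts", 250.0, 0.003),
]
single = [Stage("speech_to_speech", 600.0, 0.020)]

if __name__ == "__main__":
    print(compare(cascaded, single))
```

Keeping latency and cost as separate deltas, rather than folding them into one score, leaves the trade-off visible: a team may accept a higher per-turn cost for a lower response time, or the reverse, depending on the deployment.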