OpenAI releases GPT-Realtime, AI agent enters the era of ultra-realistic conversations.

date
29/08/2025
OpenAI releases the speech model GPT-realtime. GPT-realtime is a multimodal model dedicated to speech AI agents, capable of generating more natural and fluent speech, perfectly imitating the diverse tones, emotions, and speech speeds of humans. It supports image understanding and combines it with speech or text dialogue, making it very suitable for creating intelligent speech in fields such as customer service, education, finance, and healthcare. GPT-realtime also adds two unique voices, Marin and Cedar, and has fully upgraded the existing 8 voices.