Updates to Gemini 2.5 from Google DeepMind

New Gemini 2.5 capabilities

Native audio output and improvements to the Live API

Today, the Live API is introducing a preview version of audio-visual input and native audio out dialogue, so you can directly build conversational experiences with a more natural and expressive Gemini.

It also lets the user steer the model’s tone, accent and style of speaking. For example, you can tell the model to use a dramatic voice when telling a story. It also supports tool use, so it can search on your behalf.

You can experiment with a set of early features, including:

Affective Dialogue, in which the model detects emotion in the user’s voice and responds appropriately.
Proactive Audio, in which the model ignores background conversations and knows when to respond.
Thinking in the Live API, in which the model leverages Gemini’s thinking capabilities to support more complex tasks.

We’re also releasing new previews for text-to-speech in 2.5 Pro and 2.5 Flash. These have first-of-its-kind support for multiple speakers, enabling text-to-speech with two voices via native audio out.

Like Native Audio dialogue, text-to-speech is expressive and can capture really subtle nuances, such as whispers. It works in over 24 languages and seamlessly switches between them.
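To make the two-voice idea concrete, here is a minimal Python sketch of how an application might prepare a two-speaker script before handing it to a text-to-speech model. It does not call the Gemini API or mirror its request format: the `Turn` class, the `build_two_speaker_script` helper, the voice names, and the "Speaker: text" transcript layout are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# The announcement describes text-to-speech with two voices; this constant
# mirrors that description and is an assumption, not a documented API limit.
MAX_SPEAKERS = 2


@dataclass(frozen=True)
class Turn:
    """One line of dialogue (hypothetical helper type)."""
    speaker: str
    text: str


def build_two_speaker_script(turns, voices):
    """Return (prompt, voice_map) for a hypothetical multi-speaker request.

    turns  -- list of Turn objects in speaking order
    voices -- dict mapping speaker name to a placeholder voice name
    """
    # Collect distinct speakers in order of first appearance.
    speakers = []
    for turn in turns:
        if turn.speaker not in speakers:
            speakers.append(turn.speaker)

    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(f"expected at most {MAX_SPEAKERS} speakers, got {len(speakers)}")

    missing = [s for s in speakers if s not in voices]
    if missing:
        raise ValueError(f"no voice assigned for: {', '.join(missing)}")

    # Transcript-style prompt: one "Speaker: text" line per turn.
    prompt = "\n".join(f"{t.speaker}: {t.text}" for t in turns)
    voice_map = {s: voices[s] for s in speakers}
    return prompt, voice_map


if __name__ == "__main__":
    script = [
        Turn("Ana", "Did you hear that?"),
        Turn("Ben", "(whispering) Yes. Stay quiet."),
    ]
    prompt, voice_map = build_two_speaker_script(
        script, {"Ana": "voice_a", "Ben": "voice_b"}  # placeholder voice names
    )
    print(prompt)
    print(voice_map)
```

Validating the speaker count and voice assignments up front keeps malformed scripts from reaching the model; how the prompt and voice mapping are actually submitted depends on the SDK or endpoint you use.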