Real-time Speech transcription and translation has never been easier


If you have been keeping up with the Azure Speech Services, you will have noticed a few nice additions. Besides existing features like Text-To-Speech and Speech-To-Text (tutorials for both can be found here), the service now offers several cool new features out of the box! Speech-To-Text and Speech Translation now work in real time, which helps in several business scenarios, such as call centre analysis and conversation transcription.

Take a look at the complete list of the updated Speech services here:

Service	Feature	Description	SDK	REST
Speech-to-Text	Speech-to-text	Speech-to-text transcribes audio streams to text in real-time that your applications, tools, or devices can consume or display. Use speech-to-text with Language Understanding (LUIS) to derive user intents from transcribed speech and act on voice commands.	Yes	Yes
	Batch Transcription	Batch transcription enables asynchronous speech-to-text transcription of large volumes of data. This is a REST-based service, which uses the same endpoint as customization and model management.	No	Yes
	Conversation Transcription	Enables real-time speech recognition, speaker identification, and diarization. It’s perfect for transcribing in-person meetings with the ability to distinguish speakers.	Yes	No
	Create Custom Speech Models	If you are using speech-to-text for recognition and transcription in a unique environment, you can create and train custom acoustic, language, and pronunciation models to address ambient noise or industry-specific vocabulary.	No	Yes
Text-to-Speech	Text-to-speech	Text-to-speech converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Choose from standard voices and neural voices (see Language support).	Yes	Yes
	Create Custom Voices	Create custom voice fonts unique to your brand or product.	No	Yes
Speech Translation	Speech translation	Speech translation enables the real-time, multi-language translation of speech to your applications, tools, and devices. Use this service for speech-to-speech and speech-to-text translation.	Yes	No
Voice-first Virtual Assistants	Voice-first virtual assistants	Custom virtual assistants using Azure Speech Services empower developers to create natural, human-like conversational interfaces for their applications and experiences. The Bot Framework’s Direct Line Speech channel enhances these capabilities by providing a coordinated, orchestrated entry point to a compatible bot that enables voice in, voice out interaction with low latency and high reliability.	Yes	No

Source: Microsoft Docs
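The Text-to-Speech row above mentions Speech Synthesis Markup Language (SSML). As a rough illustration of what such input can look like, here is a minimal sketch that builds an SSML document with Python's standard library only. The helper name `build_ssml` is hypothetical, and the voice name and language are placeholder assumptions; check the Language support page for the voices available to you. It does not call the Speech service itself.

```python
import xml.etree.ElementTree as ET

# Namespaces used by SSML documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, voice_name="en-US-JennyNeural", lang="en-US"):
    """Return an SSML string that wraps `text` in a single <voice> element.

    `voice_name` and `lang` are illustrative defaults (assumptions),
    not values taken from the article.
    """
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)

    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice_name})
    # ElementTree escapes special characters such as & and < when serializing.
    voice.text = text

    return ET.tostring(speak, encoding="unicode")
```

Calling `print(build_ssml("Hello from the Speech service!"))` prints a single `<speak>` element that you could then pass to whichever Text-to-Speech client you use. Building the markup with an XML library rather than string concatenation keeps user-supplied text properly escaped.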

Find the documentation here for Conversation Transcription, Call Center Transcription, and Voice-first Virtual Assistants, and stay tuned for more tutorials!



Georgia Kalyva
