If you have been keeping up with the Azure Speech service, you will have noticed a few nice additions. Besides the established features such as Text-to-Speech and Speech-to-Text (tutorials for both can be found here), the service now offers several new features out of the box. Speech-to-Text and Speech Translation now work in real time, which helps in business scenarios such as call centre analysis and conversation transcription.
Take a look at the complete list of the updated Speech services here:
Service | Feature | Description | SDK | REST |
---|---|---|---|---|
Speech-to-Text | Speech-to-text | Speech-to-text transcribes audio streams to text in real-time that your applications, tools, or devices can consume or display. Use speech-to-text with Language Understanding (LUIS) to derive user intents from transcribed speech and act on voice commands. | Yes | Yes |
Speech-to-Text | Batch Transcription | Batch transcription enables asynchronous speech-to-text transcription of large volumes of data. This is a REST-based service, which uses the same endpoint as customization and model management. | No | Yes |
Speech-to-Text | Conversation Transcription | Enables real-time speech recognition, speaker identification, and diarization. It’s perfect for transcribing in-person meetings with the ability to distinguish speakers. | Yes | No |
Speech-to-Text | Create Custom Speech Models | If you are using speech-to-text for recognition and transcription in a unique environment, you can create and train custom acoustic, language, and pronunciation models to address ambient noise or industry-specific vocabulary. | No | Yes |
Text-to-Speech | Text-to-speech | Text-to-speech converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Choose from standard voices and neural voices (see Language support). | Yes | Yes |
Text-to-Speech | Create Custom Voices | Create custom voice fonts unique to your brand or product. | No | Yes |
Speech Translation | Speech translation | Speech translation enables the real-time, multi-language translation of speech to your applications, tools, and devices. Use this service for speech-to-speech and speech-to-text translation. | Yes | No |
Voice-first Virtual Assistants | Voice-first virtual assistants | Custom virtual assistants using Azure Speech Services empower developers to create natural, human-like conversational interfaces for their applications and experiences. The Bot Framework’s Direct Line Speech channel enhances these capabilities by providing a coordinated, orchestrated entry point to a compatible bot that enables voice in, voice out interaction with low latency and high reliability. | Yes | No |
Source: Microsoft Docs
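To make the Text-to-speech row a bit more concrete, here is a minimal Python sketch of what an SSML payload and a matching REST request could look like. It uses only the standard library and never sends anything over the network. The helper names `build_ssml` and `build_tts_request`, the voice name `en-US-JennyNeural`, the region, the endpoint pattern and the output format are all assumptions for illustration and are not taken from this post, so check them against the current REST documentation before relying on them.

```python
import urllib.request
import xml.etree.ElementTree as ET

# W3C SSML namespace and the reserved XML namespace (used for xml:lang).
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Return a minimal SSML document as a string.

    The default voice name is an assumption; pick one from the
    service's language support list.
    """
    # Serialize SSML elements without a namespace prefix.
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    voice_el = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice})
    # ElementTree escapes characters such as & and < on serialization.
    voice_el.text = text
    return ET.tostring(speak, encoding="unicode")


def build_tts_request(ssml, region, key):
    """Prepare (but do not send) a text-to-speech REST request.

    The URL pattern and header values are assumptions for this sketch.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "User-Agent": "speech-overview-sketch",
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


if __name__ == "__main__":
    ssml = build_ssml("Hello from the Speech service!")
    request = build_tts_request(ssml, region="westeurope", key="YOUR_KEY")
    print(ssml)
    print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` with a valid key would be the step that actually returns audio. The sketch stops before that so it can be run without a subscription.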
Find the documentation here for Conversation Transcription, Call Center Transcription, and Voice-first Virtual Assistants and stay tuned for more tutorials!