Integration: Microsoft Azure Text-To-Speech

This integration currently uses our Amazon Polly visemes to animate the didimo, so make sure you generate your didimos with support for Amazon Polly animation. See Built for Amazon Polly.

You can use Microsoft's Azure Text-To-Speech solution to have your didimos animated to say anything you'd like. Find out more about this service in Microsoft's Azure Text-To-Speech documentation.

Before you start, install Microsoft's Speech SDK in your Unity project.

We've prepared an integration scene that shows you how to generate your Text-To-Speech files inside our Unity SDK. The scene also showcases the component used to play back the speech and animations on a didimo.

This integration is one of our Core Samples. Find the Core package in the Package Manager of your project and import the Sample Azure TTS Integration. After importing it, open the Microsoft Azure TTS Integration scene and find the AzureTTSManager GameObject in the Hierarchy.

Azure TTS Creation

This is the TTS creation component. Use it to generate the Text-To-Speech files required to animate a didimo: the audio clip and the animation .json file.

  1. Start by filling in your API connection details, so you can access the Azure API.

  2. Specify a file name: both the animation and audio files will be generated with this name. Choose a path to save these files.

  3. Choose your desired Creation Mode: either from a String or from an SSML file.
    i. If you choose the "Create From String" mode, fill in the sentence you want spoken and choose the Azure voice, which is Jenny by default. See Microsoft's list of Azure voices to pick the one you prefer.
    ii. If you choose the "Create From SSML" mode, you only need to specify the path to your SSML file. See Microsoft's SSML documentation to learn what you can express in an SSML file.

  4. Press "Create TTS Files" and the audio and animation files will be generated in the chosen path.


Creation component in Create From String mode.
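
For the "Create From SSML" mode, the SSML file can be written by hand or generated with a script. The following is a minimal sketch in Python, assuming the component accepts a standard Azure SSML document with a speak root element and a single voice element. The helper names build_ssml and write_ssml and the output file name are hypothetical, and en-US-JennyNeural is assumed to be the full Azure name of the default Jenny voice.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr

# Namespace used by SSML documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Return a minimal SSML document that speaks `text` with `voice`.

    The voice and language defaults are assumptions; replace them with
    the voice you picked from the Azure voices list.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>\n'
        f"  <voice name={quoteattr(voice)}>{escape(text)}</voice>\n"
        "</speak>\n"
    )


def write_ssml(path, text, voice="en-US-JennyNeural", lang="en-US"):
    """Write the SSML document to `path` and return the path."""
    ssml = build_ssml(text, voice, lang)
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    out = Path(path)
    out.write_text(ssml, encoding="utf-8")
    return out


if __name__ == "__main__":
    # Hypothetical file name; point the component's SSML path at this file.
    print(write_ssml("hello_didimo.xml", "Hello! I am your didimo."))
```

Escaping the text matters because characters such as & and < would otherwise make the file invalid XML. The parse step is only a well-formedness check; it does not confirm that Azure accepts the voice name.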

Azure TTS Playback

This is the TTS Playback component. Use it to play back the files you generated, or use Stream Mode to animate a didimo at runtime without having to save any files.

  1. Start by dragging the didimo you want to use into the Didimo Components field. Choose an audio source as well.

  2. Choose your desired Playback Mode: either File Mode or Stream Mode. Stream Mode works from either a String or an SSML file.
    i. In File Mode, drag in your animation and audio files.
    ii. In Stream Mode, start by filling in the API Connection Details. After that, the process is the same as in the creation component: for a String, fill in the text and the voice name; for SSML, fill in the path to the file.

  3. Enter Play mode.

  4. Press the "Playback TTS" button.


Playback component in File mode.