Text to Speech

Text-to-Speech (TTS) technology transforms written text into spoken words. Our TTS service provides a realistic, human-like voice output, making it an essential tool not only for personal accessibility but also for enhancing user experience in various digital platforms.

It is pivotal in creating digital avatars and voice bots that offer more engaging and interactive user experiences. In the realm of customer service, TTS powers bots for call centers, enabling efficient and human-like interactions for queries and support. It's also instrumental in developing voice assistants and smart home devices, facilitating convenient voice commands and information retrieval. Additionally, TTS is used in content creation for voiceovers and narration, providing a cost-effective and efficient solution for producing diverse audio content.

Prerequisites

Make sure to follow Get Started to sign up and complete all the necessary prerequisites. If you are using the APIs or the SDK, save your API key in an environment variable called NS_API_KEY before moving ahead, using the command below:

export NS_API_KEY=YOUR_API_KEY

Refer to the Voices page for speaker IDs of available voices in different languages.

Start a Speech Synthesis Job

Copy and paste the curl request below into your terminal to generate audio from your text using the API. Fill in the variables with the appropriate values.

curl --location 'https://voice.neuralspace.ai/api/v1/tts' \
  --header "Authorization: $NS_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "text": "مرحبا بالعالم",
    "speaker_id": "ar-male-Omar-saudi-neutral"
  }'

Data Parameter    Required    Description
text              Yes         The text that you want to synthesize into speech.
speaker_id        Yes         Speaker ID (can be obtained from the Voices page).

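For readers who would rather call the endpoint from code than from a terminal, the curl request above can be sketched in Python using only the standard library. This is a minimal sketch, not the official SDK: the helper names build_tts_request and synthesize are hypothetical, and it assumes the endpoint returns a JSON body when streaming is not requested.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
TTS_URL = "https://voice.neuralspace.ai/api/v1/tts"


def build_tts_request(text: str, speaker_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    body = json.dumps({"text": text, "speaker_id": speaker_id}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, speaker_id: str, api_key: str) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    request = build_tts_request(text, speaker_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as synthesize("مرحبا بالعالم", "ar-male-Omar-saudi-neutral", os.environ["NS_API_KEY"]) would then mirror the curl command, reading the key from the variable exported in the Prerequisites step.
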
More Configurations

Apart from the required parameters passed in the example above, we support the optional configurations listed below. These can be passed inside the data dictionary. Please refer to the API Reference for more details on how to pass them in the request.

"data": {
  ...
  "stream": true,
  "config": {
    "pace": 1.0,
    "volume": 1.0,
    "pitch_shift": 0,
    "pitch_scale": 1.0
  }
}
  • stream: Enable streaming to directly get the audio generated as bytes instead of a file download link.
  • config: Control the pace, volume, and pitch of the generated audio through their respective parameters.
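
The optional fields described above can be assembled into a request body as in the following sketch. It assumes that stream and config sit at the same level as text and speaker_id, as the snippet suggests; build_payload is a hypothetical helper, and the API Reference remains the authority on the exact layout. Note that Python's True serializes to the JSON literal true.

```python
import json
from typing import Any, Dict, Optional


def build_payload(
    text: str,
    speaker_id: str,
    stream: bool = False,
    config: Optional[Dict[str, float]] = None,
) -> Dict[str, Any]:
    """Assemble a TTS request body with the optional fields.

    Hypothetical helper; assumes the optional keys are siblings of
    "text" and "speaker_id" in the request data.
    """
    payload: Dict[str, Any] = {"text": text, "speaker_id": speaker_id}
    if stream:
        # Only sent when streaming is requested.
        payload["stream"] = True
    if config is not None:
        # Copy so later changes by the caller do not alter the payload.
        payload["config"] = dict(config)
    return payload


# Values copied from the configuration snippet above.
payload = build_payload(
    "مرحبا بالعالم",
    "ar-male-Omar-saudi-neutral",
    stream=True,
    config={"pace": 1.0, "volume": 1.0, "pitch_shift": 0, "pitch_scale": 1.0},
)
body = json.dumps(payload, ensure_ascii=False)  # JSON text for the request body
```

Leaving out the optional keys entirely when they are not set keeps the request identical to the minimal curl example.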

When a request is sent via the curl command or SDK code snippet above, it returns a download link for the generated audio file along with other details of the job and an error message, if any. An example response is given below. This is the response when stream is disabled. When stream is enabled, only a byte array is returned.

{
  "success": true,
  "message": "Job created successfully",
  "data": {
    "jobId": "b2d4bcb2-f7a6-453d-84eb-00f796f23880",
    "timestamp": 1701765394885,
    "result": {
      "save_path": "https://largefilestoreprod.blob.core.windows.net/common/uploads/6272df27-81a6-442a-bb7a-f98b63243604"
    }
  }
}
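
To show how the download link might be pulled out of such a response, here is a minimal Python sketch that parses the example above. The helper extract_audio_url is hypothetical, and the handling of failures is an assumption: it supposes an unsuccessful job sets success to false and carries its explanation in message, which the example does not confirm.

```python
import json

# The example response shown above, verbatim.
EXAMPLE_RESPONSE = """
{
  "success": true,
  "message": "Job created successfully",
  "data": {
    "jobId": "b2d4bcb2-f7a6-453d-84eb-00f796f23880",
    "timestamp": 1701765394885,
    "result": {
      "save_path": "https://largefilestoreprod.blob.core.windows.net/common/uploads/6272df27-81a6-442a-bb7a-f98b63243604"
    }
  }
}
"""


def extract_audio_url(response: dict) -> str:
    """Return the audio download link from a non-streaming response.

    Hypothetical helper; assumes a failed job has "success" set to false
    and describes the problem in "message".
    """
    if not response.get("success"):
        raise RuntimeError(response.get("message", "TTS job failed"))
    return response["data"]["result"]["save_path"]


audio_url = extract_audio_url(json.loads(EXAMPLE_RESPONSE))
```

The resulting audio_url can then be fetched with any HTTP client to save the audio file locally.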