Vocabulary Adaptation
The Vocabulary Adaptation feature of the VoiceAI platform customizes speech recognition to your vocabulary. It improves transcription accuracy for specialized terminology, industry-specific jargon, and brand names that are common in your business.
For example, if your business frequently uses unique product names such as 'ZyntriQix', specialized terms like 'QuadraCore', or industry-specific references like 'BioSynthetix', Vocabulary Adaptation helps the recognizer transcribe these terms correctly. This is particularly useful for rare proper nouns, technical phrases, and other specialized language, so that your transcripts reflect what was actually said.
File Transcription Job with Vocabulary Adaptation
- API
- Python SDK
Copy and paste the curl request below into your terminal to start a transcription using the API. Fill in the variables with the appropriate values, as described in the overview.
curl --location 'https://voice.neuralspace.ai/api/v2/jobs' \
--header 'Authorization: {{API_KEY}}' \
--form 'files=@"{{LOCAL_AUDIO_FILE_PATH}}"' \
--form 'config="{\"file_transcription\":{\"mode\":\"{{MODE}}\"},
\"dictionary\":{\"words\":[\"{{WORD_1}}\", \"{{WORD_2}}\" ...]}
}"'
In the above request, pass your custom vocabulary as a comma-separated list in the words field of the dictionary parameter. The list can contain words or very short phrases.
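Hand-escaping the quotes in the config form field is easy to get wrong. As a minimal sketch, the same JSON can be generated with Python's standard json module and then pasted into the request. The field names (file_transcription, mode, dictionary, words) come from the request above; the helper name build_config is hypothetical, and the mode value "fast" is taken from the sample response on this page.

```python
import json


def build_config(words, mode):
    """Return the JSON string for the `config` form field (hypothetical helper)."""
    config = {
        "file_transcription": {"mode": mode},
        "dictionary": {"words": list(words)},
    }
    # json.dumps takes care of quoting and escaping every word
    return json.dumps(config)


config_json = build_config(["ZyntriQix", "QuadraCore", "BioSynthetix"], mode="fast")
print(config_json)
```

Generating the string this way keeps words that contain quotes or non-ASCII characters valid JSON without manual escaping.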
With the Python SDK, once the installation steps for the package are complete, run the following Python snippet:
import neuralspace as ns

vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')

# Setup job configuration
config = {
    # Transcription options such as "mode" go here
    "file_transcription": {},
    "dictionary": {
        # Replace the placeholders with your own words or very short phrases
        "words": [
            "WORD_1",
            "WORD_2",
            # ... add more entries as needed
        ]
    }
}

# Create a new file transcription job
job_id = vai.transcribe(file='path/to/audio.wav', config=config)
print(job_id)
In the above snippet, pass your custom vocabulary as a list in the words field of the dictionary parameter. The list can contain words or very short phrases.
The response looks like:
6abe4f35-8220-4981-95c7-3b040d9b86d1
Fetch Transcription Results
- API
- Python SDK
When you pass the jobId (received in response to the transcription API) to the API below, it fetches the status and results of the job.
curl --location 'https://voice.neuralspace.ai/api/v2/jobs/{{jobId}}' \
--header 'Authorization: {{API_KEY}}'
The response is similar to that of the default file transcription request. The effect of vocabulary adaptation with the passed words shows up in the accuracy of the transcript. The overall structure of the response is shown below:
{
    "success": true,
    "message": "Data fetched successfully",
    "data": {
        "timestamp": 1704720760507,
        "filename": "english_audio_sample.mp3",
        "jobId": "b96acb3c-b672-4b59-9e24-6c40fd095219",
        "params": {
            "file_transcription": {
                "language_id": "en",
                "mode": "fast"
            },
            "dictionary": ["word_1", "word_2"]
        },
        "status": "Completed",
        "audioDuration": 131.568,
        "messsage": "",
        "progress": [
            "Queued",
            "Started",
            "Transcription Started",
            "Transcription Completed",
            "Completed"
        ],
        "result": {
            "transcription": {
                "channels": {
                    "0": {
                        "transcript": "We've been at this for hours now. Have you found anything useful in any of those books? Not a single thing, Lewis. I'm sure that there must be something in this library...",
                        "timestamps": [
                            {
                                "word": "We've",
                                "start": 6.65,
                                "end": 6.99,
                                "conf": 0.8
                            },
                            {
                                "word": "been",
                                "start": 6.99,
                                "end": 7.09,
                                "conf": 0.99
                            },
                            ...
                        ]
                    }
                }
            }
        }
    }
}
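A job may not be finished the first time you fetch it, so API users typically repeat the request until data.status reports completion. The following is a rough sketch of such a loop, not part of the VoiceAI API or SDK: wait_for_job and fake_fetch are hypothetical names, the HTTP call is replaced by a stand-in so the example runs offline, and it assumes that intermediate status values resemble the entries in the progress list. Only the "Completed" status is confirmed by the response above; pass any other terminal statuses your jobs can reach through terminal_statuses.

```python
import time


def wait_for_job(fetch_job, job_id, terminal_statuses=("Completed",),
                 interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch_job(job_id) until data.status is terminal or the timeout passes."""
    waited = 0.0
    while True:
        response = fetch_job(job_id)
        status = response["data"]["status"]
        if status in terminal_statuses:
            return response
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still '{status}' after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for the real GET request, so the sketch runs without network access.
# The intermediate status values here are assumptions.
_fake_statuses = iter(["Queued", "Started", "Completed"])


def fake_fetch(job_id):
    return {"success": True, "data": {"jobId": job_id, "status": next(_fake_statuses)}}


final = wait_for_job(fake_fetch, "b96acb3c-b672-4b59-9e24-6c40fd095219",
                     sleep=lambda seconds: None)
print(final["data"]["status"])  # Completed
```

In real use, fetch_job would issue the GET request shown above and return the parsed JSON body.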
With the Python SDK, pass the job ID (job_id) returned by the transcription call to the snippet below to fetch the status and results of the job.
# Check the job's status
# If job is complete, you will directly get the output.
result = vai.get_job_status(job_id)
print(f'Current status:\n{result}')
# This should finish in a minute for the sample audio used here.
# It will depend on the duration of the audio file and other config options.
print('Waiting for completion...')
result = vai.poll_until_complete(job_id)
print(result)
The response is similar to that of the default file transcription request. The effect of vocabulary adaptation with the passed words shows up in the transcript of the audio. The overall structure of the response is shown below:
Current status:
{
    "success": true,
    "message": "Data fetched successfully",
    "data": {
        "timestamp": 1704720760507,
        "filename": "english_audio_sample.mp3",
        "jobId": "b96acb3c-b672-4b59-9e24-6c40fd095219",
        "params": {
            "file_transcription": {
                "language_id": "en",
                "mode": "fast"
            },
            "dictionary": ["word_1", "word_2"]
        },
        "status": "Completed",
        "audioDuration": 131.568,
        "messsage": "",
        "progress": [
            "Queued",
            "Started",
            "Transcription Started",
            "Transcription Completed",
            "Completed"
        ],
        "result": {
            "transcription": {
                "channels": {
                    "0": {
                        "transcript": "We've been at this for hours now. Have you found anything useful in any of those books? Not a single thing, Lewis. I'm sure that there must be something in this library...",
                        "timestamps": [
                            {
                                "word": "We've",
                                "start": 6.65,
                                "end": 6.99,
                                "conf": 0.8
                            },
                            {
                                "word": "been",
                                "start": 6.99,
                                "end": 7.09,
                                "conf": 0.99
                            },
                            ...
                        ]
                    }
                }
            }
        }
    }
}
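Each entry in timestamps carries a conf score, which can help you spot words the recognizer was unsure about; such words may be worth adding to your vocabulary list. The sketch below assumes the result is available as a Python dict with the structure shown above. The function name low_confidence_words and the 0.9 threshold are illustrative choices, not part of the SDK.

```python
def low_confidence_words(response, threshold=0.9):
    """Return (channel, word, conf) for every word whose confidence is below threshold."""
    channels = response["data"]["result"]["transcription"]["channels"]
    flagged = []
    for channel_id, channel in channels.items():
        for entry in channel["timestamps"]:
            if entry["conf"] < threshold:
                flagged.append((channel_id, entry["word"], entry["conf"]))
    return flagged


# Trimmed copy of the sample response above
sample = {"data": {"result": {"transcription": {"channels": {"0": {
    "transcript": "We've been at this for hours now.",
    "timestamps": [
        {"word": "We've", "start": 6.65, "end": 6.99, "conf": 0.8},
        {"word": "been", "start": 6.99, "end": 7.09, "conf": 0.99},
    ],
}}}}}}

print(low_confidence_words(sample))  # [('0', "We've", 0.8)]
```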
Custom Dictionary
When working in a particular domain, you might have a long list of words that you want to be identified correctly in the transcript. Instead of passing the list of words in every request, you can create and save custom dictionaries in VoiceAI and reuse them later.
Creating a Custom Dictionary
You can simply call the following API to create a dictionary:
- API
- Python SDK
curl --location 'https://voice.neuralspace.ai/api/v2/dicts' \
--header 'Authorization: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
    "name": "Sample",
    "words": ["{{WORD_1}}", "{{WORD_2}}", ...]
}'
import neuralspace as ns
vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')
# Replace the placeholders with your own words; add more entries as needed
words = ["WORD_1", "WORD_2"]
# Create a new dictionary of custom vocabulary
result = vai.create_custom_dict(name='Sample', words=words)
print(result)
The response to the above request looks like this (it is the same for the API and the Python SDK):
{"success":true,"message":"Dictionary created successfully","data":{"id":"659d2d30a9faf500129d9ca6"}}
The response consists of the status message and the dictionary id, which can later be used to reference this particular dictionary.
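If you create dictionaries from a script, you will usually want to pull the id out of this response and store it. Here is a minimal sketch using only the standard library; the helper name dictionary_id_from_response is hypothetical, and the error branch assumes that failed requests also return success and message fields, which this page does not confirm.

```python
import json


def dictionary_id_from_response(body):
    """Extract data.id from the create-dictionary response body (a JSON string)."""
    payload = json.loads(body)
    if not payload.get("success"):
        # Assumption: failures also carry a "message" field
        raise RuntimeError(payload.get("message", "dictionary creation failed"))
    return payload["data"]["id"]


body = ('{"success":true,"message":"Dictionary created successfully",'
        '"data":{"id":"659d2d30a9faf500129d9ca6"}}')
dictionary_id = dictionary_id_from_response(body)
print(dictionary_id)  # 659d2d30a9faf500129d9ca6
```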
Creating a File Transcription Job Using Existing Custom Dictionary
Using the dictionary id received earlier, you can pass the existing dictionary to a file transcription job as follows:
- API
- Python SDK
curl --location 'https://voice.neuralspace.ai/api/v2/jobs' \
--header 'Authorization: {{API_KEY}}' \
--form 'files=@"{{LOCAL_AUDIO_FILE_PATH}}"' \
--form 'config="{\"file_transcription\":{\"mode\":\"{{MODE}}\"},
\"dictionary\":{\"id\":\"{{DICTIONARY_ID}}\"}
}"'
import neuralspace as ns
vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')
# Setup job configuration
config = {
    "file_transcription": {},
    "dictionary": {
        "id": "DICTIONARY_ID"
    }
}
# Create a new file transcription job
job_id = vai.transcribe(file='path/to/audio.wav', config=config)
print(job_id)
The response to this request has the same format as the file transcription response shown earlier.
Once created, dictionaries can also be updated or deleted. You can also fetch the words in a dictionary, or list all the dictionaries you have created. For details on how to use these APIs, see the API Reference.