Subtitle Guidelines
The VoiceAI platform excels at creating subtitles for various video content, including movies and TV shows. While transforming spoken dialogue into text is a crucial step, raw transcription often doesn't suffice for effective subtitles, which must follow certain standards for viewer readability and comprehension. To address this need, VoiceAI offers a customizable subtitle guideline feature that lets you tailor key aspects of your subtitles for a better viewing experience. You can set parameters such as the maximum number of lines in a subtitle, the ideal duration each subtitle stays on screen based on the audio length, and the maximum character count per line. These configurations ensure your subtitles are not only accurate but also viewer-friendly, enhancing the overall accessibility of your content.
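For instance, a line_count of 2 combined with a character_count of 42 means no subtitle block exceeds two lines of 42 characters each, while duration caps how long a single subtitle stays on screen. The snippet below is purely a conceptual sketch of that wrapping behaviour and is not part of the VoiceAI API; the helper function and values are made up for illustration:

# Conceptual sketch only: how line_count and character_count constrain
# the shape of a subtitle block. Values are illustrative, not VoiceAI defaults.
import textwrap

def wrap_into_subtitle_blocks(text, line_count=2, character_count=42):
    """Split text into blocks of at most `line_count` lines,
    each at most `character_count` characters wide."""
    lines = textwrap.wrap(text, width=character_count)
    return [
        "\n".join(lines[i:i + line_count])
        for i in range(0, len(lines), line_count)
    ]

print(wrap_into_subtitle_blocks(
    "We've been at this for hours now. Have you found anything useful?"
))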
File Transcription Job with Subtitle Guidelines
- API
- Python SDK
Copy and paste the curl request below into your terminal to start a transcription using the API. Fill in the variables with the appropriate values, as described in the overview.
curl --location 'https://voice.neuralspace.ai/api/v2/jobs' \
--header 'Authorization: {{API_KEY}}' \
--form 'files=@"{{LOCAL_AUDIO_FILE_PATH}}"' \
--form 'config="{\"speaker_diarization\":{},\"subtitles_guidelines\":{\"line_count\": {{LINES}},\"duration\":{{DURATION}},\"character_count\":{{CHARACTER_COUNT}}}}"'
In the above request, the line_count, duration, and character_count parameters need to be passed to set the guidelines. The request returns a response similar to the regular file transcription API, as seen in the overview.
{
    "success": true,
    "message": "Job created successfully",
    "data": {
        "jobId": "281f8662-cdc3-4c76-82d0-e7d14af52c46"
    }
}
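If you are calling the API from Python without the SDK, the same request can be sketched with the requests library. This is only an illustrative equivalent of the curl command above, not an official client: the endpoint, header, and form fields mirror the curl example, and the concrete guideline values (2 lines, 5 seconds, 42 characters) are placeholders you should replace with your own.

# Illustrative sketch of the curl request above using the requests library.
# Config keys follow the curl example; the guideline values are placeholders.
import json
import requests

API_KEY = "YOUR_API_KEY"

config = {
    "speaker_diarization": {},
    "subtitles_guidelines": {
        "line_count": 2,
        "duration": 5,
        "character_count": 42,
    },
}

with open("path/to/audio.wav", "rb") as audio:
    response = requests.post(
        "https://voice.neuralspace.ai/api/v2/jobs",
        headers={"Authorization": API_KEY},
        files={"files": audio},
        data={"config": json.dumps(config)},
    )

print(response.json()["data"]["jobId"])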
Once the installation steps for the package are complete, execute the Python code snippet below:
import neuralspace as ns
vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')
# Setup job configuration
config = {
    "speaker_diarization": {},
    "subtitle_guidelines": {
        "line_count": {LINE_COUNT},
        "duration": {DURATION},
        "character_count": {CHARACTER_COUNT}
    },
}
# Create a new file transcription job
job_id = vai.transcribe(file='path/to/audio.wav', config=config)
print(job_id)
In the above snippet, the line_count, duration, and character_count parameters need to be passed to set the guidelines. As with the regular file transcription flow described in the overview, the call returns a job ID:
6abe4f35-8220-4981-95c7-3b040d9b86d1
To use the subtitle guidelines feature, you must have the speaker diarization feature enabled as well.
Fetch Transcription and Subtitle Guideline Results
- API
- Python SDK
When you pass the jobId (received in the response to the transcription API) to the API below, it fetches the status and results of the job.
curl --location 'https://voice.neuralspace.ai/api/v2/jobs/{{jobId}}' \
--header 'Authorization: {{API_KEY}}'
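If you prefer to stay outside the SDK, the same fetch can be sketched in Python with the requests library, mirroring the curl call above; the placeholders are the same ones used throughout this page:

# Illustrative sketch of the fetch request above using the requests library.
import requests

API_KEY = "YOUR_API_KEY"
job_id = "YOUR_JOB_ID"  # the jobId returned by the transcription API

response = requests.get(
    f"https://voice.neuralspace.ai/api/v2/jobs/{job_id}",
    headers={"Authorization": API_KEY},
)
print(response.json())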
Using the jobId (received in the response to the transcription API), execute the snippet below to fetch the status and results of the job.
result = vai.get_job_status(job_id)
print(result)
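If the job is still processing when you first check, you can poll until it finishes. The sketch below is only an illustration built on get_job_status: the exact name of the status field and its terminal values are assumptions here, so confirm them against a real response before relying on this.

# Hedged sketch: poll get_job_status until the job reaches a terminal state.
# Assumes the call returns the parsed JSON response as a dict and that the
# status lives under data.status with values like "completed"/"failed" --
# verify both against an actual response.
import time

def wait_for_result(vai, job_id, interval=10):
    while True:
        result = vai.get_job_status(job_id)
        status = result.get("data", {}).get("status")
        if status in ("completed", "failed"):
            return result
        time.sleep(interval)

result = wait_for_result(vai, job_id)
print(result)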
The response of the request above appears as follows:
{
    ...
    "data": {
        ...
        "result": {
            "transcription": {
                ...
                "segments": [
                    {
                        "startTime": 6.741562500000001,
                        "endTime": 8.43,
                        "text": "We've been at this for hours\nnow.",
                        "speaker": "Speaker 1"
                    },
                    {
                        "startTime": 8.43,
                        "endTime": 9.55,
                        "text": "Have you found anything useful\nin",
                        "speaker": "Speaker 1"
                    },
                    {
                        "startTime": 9.55,
                        "endTime": 10.51,
                        "text": "any of those books?",
                        "speaker": "Speaker 1"
                    },
                    ...
                ]
            }
        }
    }
}
When subtitle guidelines are enabled, the returned segments conform to the guidelines you provided instead of the default segmentation. All other features and outputs work as expected.
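Because the returned segments already respect your line and character limits, turning them into a standard subtitle file is mostly a matter of formatting timestamps. The sketch below converts segments like the ones above into SRT; it assumes result is the job response parsed into a Python dict with the structure shown:

# Sketch: convert segments (startTime/endTime in seconds, text with embedded
# line breaks) into SRT. Assumes `result` is the parsed job response above.
def to_srt_timestamp(seconds):
    ms = int(round(seconds * 1000))
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments):
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = to_srt_timestamp(seg["startTime"])
        end = to_srt_timestamp(seg["endTime"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)

segments = result["data"]["result"]["transcription"]["segments"]
with open("subtitles.srt", "w", encoding="utf-8") as srt_file:
    srt_file.write(segments_to_srt(segments))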
Troubleshooting and FAQ
Subtitle issues? Check out our FAQ page. If you still need help, feel free to reach out to us directly at support@neuralspace.ai or join our Slack community.