Inference Container

Inference containers are Docker containers that transcribe audio files in the supported languages. We provide them to customers on request. One of the main advantages of an inference container is that it can run offline, on any virtual machine that has Docker installed.

Prerequisites

If you are using our Inference Containers, you should have received the following from us:

A license.txt file
Access to our Docker Registry and a URL from which to pull the Docker image
- The latest version of the Docker image is always available under the latest tag
- Major versions are also available, tagged v1, v2, etc.

Before running a container, make sure that:

Docker is installed on your system
Each container has the following system resources available
- 1 vCPU
- 7-8 GB RAM
You are logged in to our Docker container registry
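Logging in to the registry and pulling the image could look like the sketch below. The helper name pull_image and the DOCKER override variable are illustrative only and are not part of the container distribution; the registry host and image path are the values you received from us and appear here as placeholders.

```shell
# Illustrative helper (hypothetical name), not shipped with the container.
# DOCKER can be overridden, e.g. DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

pull_image() {
    # $1: registry host, $2: image path including tag (both provided to you)
    registry=$1
    image=$2
    # docker login prompts for credentials; pull only if the login succeeds
    "$DOCKER" login "$registry" && "$DOCKER" pull "$image"
}

# Usage (placeholders, replace before running):
# pull_image "<YOUR-DOCKER-REGISTRY-URL>" "<YOUR-DOCKER-IMAGE-PATH>:latest"
```

Replacing latest with a major-version tag such as v1 pins the image to that major version.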

Transcribe an audio file

You can run the Docker image directly on your system. Mount the audio file to /input.audio, your license file to /license.txt, and an output directory to /app/output in the container. The transcription output is returned on stdout.

docker run -v <YOUR-AUDIO-FILE-PATH>:/input.audio -v <YOUR-LICENSE.TXT-FILE-PATH>:/license.txt -v <PATH-TO-YOUR-OUTPUT-DIRECTORY>:/app/output <YOUR-DOCKER-IMAGE-PATH>:latest

This command writes the final transcription response to response.json inside the output directory you provided.
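Docker bind mounts expect absolute host paths, so a small wrapper can help when the audio file or output directory is given as a relative path. The following is a minimal sketch under that assumption; the names transcribe, abs_path, IMAGE, and DOCKER are hypothetical and not part of the container.

```shell
# Illustrative wrapper (hypothetical names), not shipped with the container.
# IMAGE is a placeholder; set it to the image path you were given.
IMAGE="${IMAGE:-<YOUR-DOCKER-IMAGE-PATH>:latest}"
# DOCKER can be overridden, e.g. DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

abs_path() {
    # Print the absolute form of a path; its parent directory must exist.
    parent=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '%s/%s\n' "$parent" "$(basename "$1")"
}

transcribe() {
    # $1: audio file, $2: license.txt, $3: output directory
    audio=$(abs_path "$1") || return 1
    license=$(abs_path "$2") || return 1
    mkdir -p "$3" || return 1
    outdir=$(abs_path "$3") || return 1
    "$DOCKER" run \
        -v "$audio":/input.audio \
        -v "$license":/license.txt \
        -v "$outdir":/app/output \
        "$IMAGE"
}

# Usage:
# transcribe recording.wav license.txt output
```

The wrapper issues the same docker run command as above, with the three mounts resolved to absolute paths first.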

Normalize numbers in the transcript

Use the -nf flag to normalize numbers in the transcript.

Normalize all numbers to words

docker run -v <YOUR-AUDIO-FILE-PATH>:/input.audio -v <YOUR-LICENSE.TXT-FILE-PATH>:/license.txt -v <PATH-TO-YOUR-OUTPUT-DIRECTORY>:/app/output <YOUR-DOCKER-IMAGE-PATH>:latest -nf words

Normalize all numbers to digits

docker run -v <YOUR-AUDIO-FILE-PATH>:/input.audio -v <YOUR-LICENSE.TXT-FILE-PATH>:/license.txt -v <PATH-TO-YOUR-OUTPUT-DIRECTORY>:/app/output <YOUR-DOCKER-IMAGE-PATH>:latest -nf digits

As with a plain transcription run, the final transcription response is written to response.json inside the output directory you provided.
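When the normalization mode comes from a script variable, it may be worth validating it before passing it to the container. The sketch below assumes that words and digits, the two values documented above, are the only accepted values; the function name nf_args is hypothetical.

```shell
# Illustrative helper (hypothetical name), not shipped with the container.
nf_args() {
    # Print the -nf arguments for a normalization mode.
    # An empty mode prints nothing, so no flag is passed.
    case "$1" in
        words|digits) printf '%s %s\n' -nf "$1" ;;
        "") ;;
        *) echo "unsupported normalization mode: $1" >&2; return 1 ;;
    esac
}

# Usage (unquoted on purpose so the output splits into two arguments):
# docker run ... <YOUR-DOCKER-IMAGE-PATH>:latest $(nf_args digits)
```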

Log transcription response

If you want to save request and response logs for the audio files you process, mount a logs directory to the container to persist the logs in local storage.

mkdir logs
sudo chown -R nobody:nogroup logs
sudo chmod -R a+rwx logs

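docker run -v <YOUR-AUDIO-FILE-PATH>:/input.audio -v <YOUR-LICENSE.TXT-FILE-PATH>:/license.txt -v "$(pwd)/logs":/logs <YOUR-DOCKER-IMAGE-PATH>:latest > "response.json"

This writes the encrypted logs to the logs directory you mounted. The host side of the mount must be an absolute path, because Docker treats a bare name such as logs as a named volume rather than a local directory.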

Response Format

When the Docker container runs successfully, it returns exit code 0. If an error occurs, it returns exit code 1. A successful response has the following form:

{
    "output": {
        "transcription": "زارني في أوائل الشهر بدري",
        "timestamps": [
            {
                "word": "زارني",
                "start_time": 1.1344375,
                "end_time": 1.595875,
                "confidence_score": 0.5,
                "normalized_word": "زارني"
            },
            {
                "word": "في",
                "start_time": 1.676125,
                "end_time": 1.8165625,
                "confidence_score": 0.5,
                "normalized_word": "في"
            },
            {
                "word": "أوائل",
                "start_time": 1.9369375,
                "end_time": 2.3381875,
                "confidence_score": 0.5,
                "normalized_word": "أوائل"
            },
            {
                "word": "الشهر",
                "start_time": 2.3783125,
                "end_time": 2.799625,
                "confidence_score": 0.5,
                "normalized_word": "الشهر"
            },
            {
                "word": "بدري",
                "start_time": 2.960125,
                "end_time": 3.2508125,
                "confidence_score": 0.5,
                "normalized_word": "بدري"
            }
        ]
    },
    "success": true,
    "message": "Successfully transcribed file."
}
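A calling script can combine the exit code with the success field of the response. The following is a minimal sketch; the function names run_checked and response_ok are hypothetical, and the grep pattern is a crude stand-in for a real JSON parser that assumes the response layout shown above.

```shell
# Illustrative helpers (hypothetical names), not shipped with the container.

run_checked() {
    # Run the given command and report a non-zero exit code on stderr.
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "transcription failed with exit code $status" >&2
    fi
    return "$status"
}

response_ok() {
    # Succeed only if the response file contains "success": true.
    # grep avoids extra dependencies; a JSON parser is more robust.
    grep -Eq '"success"[[:space:]]*:[[:space:]]*true' "$1"
}

# Usage (placeholders, replace before running):
# run_checked docker run ... <YOUR-DOCKER-IMAGE-PATH>:latest \
#     && response_ok "<PATH-TO-YOUR-OUTPUT-DIRECTORY>/response.json"
```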
