Business Impact

  • $12M

    Per year in potential cost savings

Customer Key Facts

  • Location : North America
  • Industry : Government

Problem Context

A federal agency responsible for the care, custody, and control of more than 175,000 incarcerated individuals holds a large collection of audio recordings of inmates’ telephone conversations. It spends around $12 million a year on human translation services to obtain transcripts of these recordings, which it uses to supervise inmates and take action when a threat is identified.




  • Each recording was provided as a single audio channel containing multiple speakers, so speaker diarization was required to separate them
  • In Spanish audio files with no substantial pause between the two speakers, overlapping speech blurs the context of the conversation and degrades the transcribed output
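
To make the diarization challenge concrete, here is a minimal sketch of what happens after diarization. It assumes a diarization step has already labelled each transcribed word with a speaker tag (Google Speech-to-Text can attach such a tag to each word when diarization is enabled). The `Turn` class, the `group_into_turns` helper, and the sample data are hypothetical names introduced for illustration, not part of the solution described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One uninterrupted stretch of speech by a single speaker."""
    speaker: int
    text: str


def group_into_turns(words: List[Tuple[str, int]]) -> List[Turn]:
    """Merge consecutive words that share a speaker tag into one turn.

    `words` is a list of (word, speaker_tag) pairs in spoken order.
    This input shape is an assumption made for this sketch.
    """
    turns: List[Turn] = []
    for word, speaker in words:
        if turns and turns[-1].speaker == speaker:
            # Same speaker is still talking: extend the current turn.
            turns[-1].text += " " + word
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append(Turn(speaker=speaker, text=word))
    return turns


# Hypothetical diarized output from a single-channel Spanish recording.
sample = [
    ("hola", 1), ("cómo", 1), ("estás", 1),
    ("bien", 2), ("gracias", 2),
    ("perfecto", 1),
]

for turn in group_into_turns(sample):
    print(f"Speaker {turn.speaker}: {turn.text}")
```

When speakers overlap without a pause, the per-word tags themselves become unreliable, so a grouping step like this one inherits those errors. That is why overlapping speech affects the final transcript.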

Technologies Used

Google Cloud Functions
Google Speech-to-Text API
Google Translate API
Google's AutoML
Google Cloud Storage
App Engine

Automation and Digitization of Audio Transcripts for Safety at Scale

To monitor inmates’ phone calls more effectively and lower the associated cost, the agency wanted to streamline the process of identifying threats by using automated machine translation. The goal was to increase accuracy, reduce the latency of the overall transcription process, and significantly reduce the human effort involved.


Quantiphi built a custom speech-to-text and translation solution that converts prison calls recorded in Spanish into text, along with a highly interactive user interface that displays the translated transcripts with threat and sentiment analysis.
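
The flow described above (transcribe the Spanish audio, translate it, then surface possible threats) can be sketched as a small pipeline. This is only an illustration under stated assumptions. The `transcribe` and `translate` stages are passed in as plain functions, so no real cloud API is called. The watch-term match is a simple stand-in for the threat analysis and is not the method used in the actual solution. All names here (`CallReport`, `process_call`, `watch_terms`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CallReport:
    """What a reviewer would see for one call (hypothetical shape)."""
    spanish_text: str
    english_text: str
    flagged_terms: List[str]


def process_call(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str], str],
    watch_terms: List[str],
) -> CallReport:
    """Run one recording through transcription, translation, and flagging.

    `transcribe` and `translate` are injected so that real services can
    be swapped in; here they are assumptions, not actual API clients.
    """
    spanish = transcribe(audio)
    english = translate(spanish)
    lowered = english.lower()
    # Stand-in for threat analysis: case-insensitive substring match.
    flagged = [term for term in watch_terms if term.lower() in lowered]
    return CallReport(spanish, english, flagged)


# Stub stages with canned output, for illustration only.
def fake_transcribe(audio: bytes) -> str:
    return "nos vemos en el patio mañana"


def fake_translate(text: str) -> str:
    return "See you in the yard tomorrow"


report = process_call(b"raw-audio", fake_transcribe, fake_translate, ["yard", "knife"])
print(report.english_text, report.flagged_terms)
```

Keeping each stage behind a plain function signature means the transcription, translation, and analysis components can be tested or replaced independently.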


  • Reduced manual effort
  • Millions of dollars in cost savings
  • Future potential to flag events of threat from conversations
