Speech-to-Text Conversion & Threat Detection
Public Sector
Business Impacts
$12M
Per year in potential cost savings
Customer Key Facts
- Location : North America
- Industry : Government
Problem Context
A federal agency responsible for the care, custody, and control of over 175,000 incarcerated individuals holds a large collection of audio recordings of inmates’ telephone conversations. It spends around $12 million a year on human translation services to obtain transcripts of these recordings, which it uses to supervise inmates and to act when a threat is identified.
Challenges
- Each recording provided a single audio channel with multiple people in conversation, which required speaker diarization
- In Spanish audio files with no substantial pause between the two speakers, overlapping speech blurs the context of the conversation and degrades the transcribed output
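The diarization challenge can be illustrated with a small post-processing sketch. It assumes a recognizer has already labeled each word with a speaker tag, as Google Speech-to-Text can do when diarization is enabled. The `(word, speaker_tag)` tuple shape, the `group_turns` helper, and the sample data are hypothetical and are not taken from the solution described here.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def group_turns(words: Iterable[Tuple[str, int]]) -> List[Tuple[int, str]]:
    """Collapse consecutive words that share a speaker tag into one turn."""
    turns = []
    # groupby only merges adjacent items, so a speaker who talks twice
    # produces two separate turns, preserving the order of the conversation.
    for speaker, run in groupby(words, key=lambda w: w[1]):
        turns.append((speaker, " ".join(word for word, _ in run)))
    return turns


# Hypothetical single-channel call with two speakers (tags 1 and 2).
sample = [
    ("hola", 1), ("como", 1), ("estas", 1),
    ("bien", 2), ("gracias", 2),
    ("y", 1), ("tu", 1),
]
turns = group_turns(sample)
```

When speakers overlap without a clear pause, the tags may flip in the middle of a phrase, which would show up here as many very short turns and is one way the second challenge could surface in practice.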
Technologies Used
Google Cloud Functions
Google Speech-to-Text API
Google Translate API
Google's AutoML
Google Cloud Storage
App Engine
Automation and Digitization of Audio Transcripts for Safety at Scale
To better monitor inmates' phone calls and lower the cost involved, the agency wanted to streamline the process of identifying threats by using automated machine translation. This would increase accuracy, reduce the latency of the transcription process, and significantly reduce the human effort involved.
Solution
Quantiphi built a custom speech-to-text translation solution that converts prison calls recorded in Spanish into text, along with a highly interactive user interface that shows the translated transcriptions with threat and sentiment analysis.
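The flow described above can be sketched as a chain of three stages. This is a minimal illustration only: the stage order (transcribe, then translate, then classify), the `CallReport` type, and the `process_call` function are assumptions, and the stand-in lambdas do not call any real service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallReport:
    spanish_text: str
    english_text: str
    threat_label: str


def process_call(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str], str],
    classify: Callable[[str], str],
) -> CallReport:
    """Run one recording through the three hypothetical stages in order."""
    spanish = transcribe(audio)   # speech-to-text on the Spanish audio
    english = translate(spanish)  # machine translation to English
    return CallReport(spanish, english, classify(english))


# Stand-in stages for illustration; a real deployment would wrap
# the speech, translation, and classification services here.
report = process_call(
    b"",
    transcribe=lambda audio: "hola",
    translate=lambda text: {"hola": "hello"}.get(text, text),
    classify=lambda text: "no_threat",
)
```

Passing the stages in as callables keeps the sketch independent of any particular client library, so each stage can be swapped or tested on its own.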
Result
- Reduced manual effort
- Millions of dollars in cost savings
- Future potential to flag events of threat from conversations