AI in Finance: Investing in Accurate Automatic Speech Recognition
Theseus AI cut their Word Error Rate by over 90%, transforming their financial speech recognition tool.
In the world of financial services, every word counts. Misheard terms or misrecognized figures in transcripts of earnings calls, analyst briefings or investor conference calls can lead to misinformed decisions, compliance risks or reputational damage. That’s the problem Theseus AI set out to solve: improving the accuracy of speech recognition in a domain where precision is non-negotiable.
While general speech recognition models (such as Whisper-large-v3) perform admirably on open-source French datasets, they often fall short where specialized vocabulary, accents and context matter. These pre-trained models typically register a Word Error Rate (WER) of ~4.5% on general French speech. But when faced with financial jargon, numbers and abbreviations, the WER can skyrocket, and in finance the stakes of each error are high. Theseus AI found that the baseline WER on its financial audio was around 18%: far too high for enterprise or regulatory uses.
That’s where domain-specific fine-tuning comes in. By partnering with Defined.ai to source curated, high-quality financial speech data, and running on RunPod’s GPU infrastructure, Theseus AI was able to fine-tune Whisper-large-v3 on data that mirrored the real use case. The result: a dramatic drop in errors and a leap forward in trust, utility and performance.
Partnering with Defined.ai got us access to a financial-specific audio database in French and English with over 400 hours of annotated data. — Theseus AI
Our customer: On the cutting edge of financial services trends
Theseus AI is a cutting-edge research group focused on advancing transcription technology for domain-specific applications. With finance as one of the most demanding fields for accuracy, they needed an ASR solution that could keep up with the precision required by traders, analysts and compliance officers alike.
The context: What is ASR?
Automatic Speech Recognition (ASR) uses AI to convert spoken language into text. It powers everything from voice assistants to meeting transcription tools. But while general ASR models are trained on broad datasets and perform well in everyday contexts, they often underperform when applied to specialized domains.
Similarly, most open-source speech datasets are built for general conversation, media or broadcast. While powerful models like Whisper-large-v3 achieve strong results on these, they struggle when faced with domain-specific jargon.
These gaps meant the model missed key terms, misheard proper nouns or scrambled numbers, rendering it unreliable for Theseus AI’s financial use cases.
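To make the metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The function name and the example sentences are illustrative assumptions, not Theseus AI's evaluation code, and real pipelines usually normalize punctuation and number formats before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two misheard figures in an eight-word sentence.
reference = "revenue rose fifteen percent to four billion euros"
hypothesis = "revenue rose fifty percent to four million euros"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 25.0%
```

Both errors in this example change the meaning of the sentence, which is why a WER that looks tolerable for everyday speech can be unacceptable for financial transcripts.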
The solution: Better financial data for machine learning
Theseus AI turned to Defined.ai for curated, ethically sourced and domain-specific French financial audio data. With it, they fine-tuned Whisper-large-v3 to better understand the specialized vocabulary, cadence and context of the financial domain.
RunPod played a pivotal role by providing a cost-effective and scalable cloud platform equipped with high-performance GPUs. This setup gave Theseus AI time-limited access to Defined.ai’s specialized datasets, which were hosted on RunPod’s servers.
Key elements of the approach:
- Expert-curated speech data designed to reflect authentic financial interactions
- Targeted fine-tuning on top of the foundation model
- Rigorous validation to measure accuracy improvements in real-world conditions
The results: Improving AI Word Error Rate in fintech
The impact was transformative:
- WER plummeted from ~18% to just ~1.7% on the financial validation dataset
- At ~1.7%, the fine-tuned model's WER on financial speech is less than half the ~4.5% the base model registers on open-source French corpora
- The model makes over 10× fewer errors on financial speech than its baseline, a relative WER reduction of more than 90%
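The arithmetic behind these figures can be checked with a short sketch. The WER values are the approximate ones reported in this article; the function and variable names are illustrative.

```python
def relative_wer_reduction(before: float, after: float) -> float:
    """Fraction of the original errors removed by fine-tuning."""
    return (before - after) / before


baseline_wer = 0.18    # Whisper-large-v3 on the financial validation set
finetuned_wer = 0.017  # after fine-tuning on financial speech data
general_wer = 0.045    # Whisper-large-v3 on general French datasets

reduction = relative_wer_reduction(baseline_wer, finetuned_wer)
error_ratio = baseline_wer / finetuned_wer
vs_general = general_wer / finetuned_wer

print(f"Relative WER reduction: {reduction:.1%}")          # 90.6%
print(f"Errors vs. baseline: {error_ratio:.1f}x fewer")    # 10.6x fewer
print(f"Errors vs. general French: {vs_general:.1f}x fewer")  # 2.6x fewer
```

Because the published figures are rounded, these ratios are approximate.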
As Theseus AI put it:
“Our results are really promising. Whisper-large-v3 starts with a WER of 18% on the validation dataset: huge compared to ~4.5% WER with generalist French datasets. We fine-tuned it and later reached ~1.7% WER on financial-specific data.”
With this breakthrough, Theseus AI proved that specialized data is the bridge between general AI and expert-level performance.
Learn how Defined.ai can help you lower your AI model's WER: speak to an expert!