American English Scripted Monologue, emotions

Start training, testing, or fine-tuning your speech models with 1,927 hours of American English recorded on device. This dataset is perfect for those who are looking for short, high-quality recordings with extremely accurate transcriptions for their ASR models. Recordings are saved as .wav files with a sample rate of 48,000 Hz and a bit depth of 16 bits.

Emotions

Dataset specs

Type

Audio

Sound quality

≥48 kHz, 16 bits per channel

Region/Locale

EN, en-US

Amount

1.9K hours

Content type

Scripted Speech

Duration

< 1 min

Compression

None/Lossless

Dataset Subtype

Monologue

Domain

Emotions

File Format

wav
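
As a rough illustration of how the specs above could be checked on a delivered file, here is a minimal sketch using Python's standard `wave` module. The function name `check_recording` and the constants are hypothetical and not part of any vendor tooling; the thresholds are simply the values listed above (at least 48 kHz, 16 bit, under one minute). It assumes the files are plain uncompressed PCM .wav, which is what the `wave` module reads.

```python
import wave

# Thresholds copied from the dataset specs above. The names below are
# illustrative assumptions, not an official validation tool.
EXPECTED_MIN_SAMPLE_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16 bit
MAX_DURATION_SECONDS = 60.0       # "Duration < 1 min"


def check_recording(path):
    """Return a list of spec violations for one uncompressed PCM .wav file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        sample_width = wav.getsampwidth()
        duration = wav.getnframes() / sample_rate

    problems = []
    if sample_rate < EXPECTED_MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {sample_rate} Hz is below 48000 Hz")
    if sample_width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"bit depth is {sample_width * 8}, expected 16")
    if duration >= MAX_DURATION_SECONDS:
        problems.append(f"duration {duration:.1f} s is not under one minute")
    return problems
```

Calling `check_recording("some_file.wav")` (the path is a placeholder) returns an empty list when the file matches all three specs, and a list of human-readable problems otherwise.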

Leverage

Advance AI's understanding and generation of natural American English speech.

Empower your technologies with precise, controlled speech signals, designed to build accurate and reliable American English language understanding at scale.

Use cases

Speech Recognition and Analysis
Natural Language Processing and Understanding
Keyword Spotting and Voice Command Recognition

Do you need a specific dataset?

We understand the uniqueness of every project. That's why we offer customizable dataset solutions to match your specific requirements.