American English Scripted Monologue, TTS

Start training, testing, or fine-tuning your speech models with 15 hours of American English recorded on device. This dataset is suited to those looking for short, high-quality recordings with extremely accurate transcriptions for their ASR models. Recordings are saved as .wav files with a sample rate of 48,000 Hz and a bit depth of 32 bits.

TTS

Dataset specs

Type

Audio

Sound quality

≥48kHz, ≥32 bit per channel

Region/Locale

EN, en-US

Amount

15 hours

Content type

Scripted Speech

Duration

< 1m

Compression

None/Lossless

Dataset Subtype

Monologue

Domain

TTS

File Format

wav
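
The specs above (at least 48 kHz, at least 32 bits, clips under one minute, wav) can be checked on a delivered file before it enters a training pipeline. The following is a minimal sketch, not part of the dataset's tooling: the function name `check_wav` and its parameters are hypothetical, and it assumes the files are integer PCM. Python's standard `wave` module does not read IEEE-float WAV files, and the listing does not state which encoding is used.

```python
import wave


def check_wav(path, min_rate=48_000, min_bits=32, max_seconds=60.0):
    """Read a PCM WAV header and compare it with the listed specs.

    The thresholds are taken from the spec list (>= 48 kHz, >= 32 bit,
    duration < 1 minute); the function itself is an illustrative helper.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()          # samples per second
        bits = wf.getsampwidth() * 8      # bytes per sample -> bits
        channels = wf.getnchannels()
        seconds = wf.getnframes() / rate  # clip length in seconds
    return {
        "sample_rate": rate,
        "bit_depth": bits,
        "channels": channels,
        "duration_s": seconds,
        "meets_spec": rate >= min_rate
        and bits >= min_bits
        and seconds < max_seconds,
    }
```

Calling `check_wav("clip.wav")` on a hypothetical file returns a dictionary whose `meets_spec` entry is `True` only when all three thresholds hold. If the recordings turn out to be 32-bit float, `wave.open` raises `wave.Error`, and a third-party reader would be needed instead.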

Leverage

  • Advance AI's understanding and generation of natural American English speech.

  • Empower your technologies with precise, controlled speech signals, designed to build accurate and reliable American English language understanding at scale.

Use cases

  • Speech Recognition and Analysis

  • Natural Language Processing and Understanding

  • Keyword Spotting and Voice Command Recognition

Do you need a specific dataset?

We understand the uniqueness of every project. That's why we offer customizable dataset solutions to match your specific requirements.

Couldn’t find the right dataset for you?

Get in touch

© 2026 DefinedCrowd. All rights reserved.
