
English Doctor-Patient Conversations

Start training, testing, or fine-tuning your speech models with 2,000 hours of live English doctor-patient conversations. Human-generated transcriptions are available, along with professionally annotated SOAP-structured notes. The dataset suits anyone looking for spontaneous speech between medical professionals and patients, for use cases such as ASR training and automated SOAP note generation.

Healthcare

Dataset specs

Type: Audio
Sound quality: 16 kHz, 16-bit per channel
Region/Locale: EN
Amount: 2,000 hours
Content type: Medical Conversation
Duration: 1-10 min
Compression: None/Lossless
Channel separation: No
Dataset subtype: Call Center
Domain: Healthcare
File format: WAV
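
As an illustration of how the specs above might be checked on delivery, the following is a minimal sketch using only Python's standard `wave` module. The names `check_wav` and `make_silent_wav` are hypothetical helpers written for this example and are not part of any vendor tooling. It also assumes that the duration spec means each file runs between 1 and 10 minutes.

```python
import io
import wave

# Expected values taken from the spec list above.
EXPECTED_SAMPLE_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
# Assumption: the duration spec is read as 1 to 10 minutes per file.
MIN_DURATION_S = 60
MAX_DURATION_S = 600


def check_wav(source):
    """Return a list of spec mismatches for one WAV file (empty if none).

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()

    duration = frames / rate if rate else 0.0
    problems = []
    if rate != EXPECTED_SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE_HZ}")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8} bit, expected 16")
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        problems.append(f"duration is {duration:.1f} s, expected 60 to 600")
    return problems


def make_silent_wav(seconds, rate=16_000, width=2):
    """Build an in-memory mono WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (seconds * rate * width))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav(make_silent_wav(90)))            # matches the assumed spec
    print(check_wav(make_silent_wav(5, rate=8000)))  # wrong rate, too short
```

The sketch deliberately does not check the channel count, because the spec list only states that speakers are not separated by channel and does not say how many channels each file has.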

Leverage

  • Train healthcare AI on real clinical conversations, verified by medical experts and grounded in real-world care

Use cases

  • Train your AI models to understand and interpret medical conversations, providing valuable insights to assist in clinical decision-making.

  • Build chatbots and virtual assistants that can provide personalized health advice, answer medical questions, and assist with appointment scheduling and medication reminders.

Do you need a specific dataset?

We understand the uniqueness of every project. That's why we offer customizable dataset solutions to match your specific requirements.

Couldn’t find the right dataset for you?

Get in touch

© 2026 DefinedCrowd. All rights reserved.
