Defined.ai Awarded ISO 42001 Certification, Strengthening Leadership in Responsible AI Data


Find the right datasets for you

Suggested filters

Healthcare

English Question-Answer pairs

4,000,000 Question-Answer pairs in English regarding medical topics between patients and doctors.

Domain: Healthcare, Question - Answer
Locale: EN
Amount: 4M

Transcribed English Doctor-Patient conversations

More than 8K ethically collected conversations between physicians and their patients, transcribed verbatim.

Domain: Healthcare
Locale: EN, en-US
Amount: 8K

SOAP notes of English Doctor-Patient Conversations

More than 6K professionally annotated SOAP notes of conversations between physicians and their patients.

Domain: Healthcare
Locale: EN, en-US
Amount: 6.9K

English Question-Answer pairs

55,000 Question-Answer pairs in English regarding medical topics between patients and doctors.

Domain: Healthcare, Question - Answer
Locale: EN
Amount: 55K

Medical Claims Data for AI Model Training

Medical claims data from 14,000,000 patients.

Domain: Healthcare
Amount: 14M

Longitudinal Data in Oncology for AI Model Development

Unique and extensive longitudinal data aggregating a wealth of information from 111,000 oncology patients.

Domain: Healthcare
Amount: 111K

Wearable Health Data for AI Model Training

Consumer wearable health and activity data from 49,000 patients.

Domain: Healthcare
Amount: 49K

English Wellness Articles

More than 2M tokens in 1,248 English articles covering wellness topics.

Domain: Healthcare, Wellness
Locale: EN
Amount: 2.4M tokens

Showing 8 of 8 datasets


New datasets

Medical Claims Data for AI Model Training

Healthcare

Longitudinal Data in Oncology for AI Model Development

Healthcare

Wearable Health Data for AI Model Training

Healthcare

Hot datasets

Live Spanish Call Center Audio Dataset

Call Center

DICOM Medical Imaging Dataset with Clinical Reports

Healthcare

Multimodal Dataset for Household Robotics

Robotics
3D and Lidar


© 2026 DefinedCrowd. All rights reserved.
