
  • Browse Marketplace
  • Data Annotation

    Human-led labeling for text, audio, image and video

    Machine Translation

    High-quality multilingual content for global AI systems

    Data Collection

    Global, diverse datasets for AI training at scale

    Conversational AI

    Natural, bias-free voice and chat experiences worldwide

    Data & Model Evaluation

    Rigorous testing to ensure accuracy, fairness and quality

    Accelerat.ai

    Smarter multilingual AI agent support for global businesses


    Industries

Find the right datasets for you

Suggested filters

Healthcare, image

Dataset title

Domain

Type

Locale

Amount

Venezuelan Spanish Podcasts

37 hours of Venezuelan Spanish simulated podcasts, recorded with studio quality.

Various
Podcast

es-VE

37 hours

Lebanese Arabic Podcasts

5 hours of Lebanese Arabic simulated podcasts, recorded with studio quality.

Various
Podcast

AR

5 hours

Mexican Spanish Podcasts

17 hours of Mexican Spanish simulated podcasts, recorded with studio quality.

Various
Podcast

es-MX

17 hours

German Podcasts

231 hours of German simulated podcasts, recorded with studio quality.

Various
Podcast

de-DE, DE

30K hours

Moroccan French Podcasts

1 hour of Moroccan French simulated podcasts, recorded with studio quality.

Various
Podcast

FR, fr-MA

1 hour

Tunisian Arabic Podcasts

5 hours of Tunisian Arabic simulated podcasts, recorded with studio quality.

Various
Podcast

AR

5 hours

American Spanish Podcasts

6 hours of American Spanish simulated podcasts, recorded with studio quality.

Various
Podcast

es-US

6 hours

Libyan Arabic Podcasts

8 hours of Libyan Arabic simulated podcasts, recorded with studio quality.

Various
Podcast

AR

8 hours

Jordanian Arabic Podcasts

17 hours of Jordanian Arabic simulated podcasts, recorded with studio quality.

Various
Podcast

AR, ar-JO

17 hours

Japanese Podcasts

156 hours of Japanese simulated podcasts, recorded with studio quality.

Various
Podcast

JA, ja-JP

1.3K hours

Showing 10 of 181 datasets


New datasets

Medical Claims Data for AI Model Training

Healthcare

Longitudinal Data in Oncology for AI Model Development

Healthcare

Wearable Health Data for AI Model Training

Healthcare

Couldn’t find the right dataset for you?

Get in touch

© 2026 DefinedCrowd. All rights reserved.


Datasets

Marketplace

Solutions

  • Privacy and Cookie Policy
  • Terms & Conditions (T&M)
  • Data License Agreement
  • Supplier Program
  • CCPA Privacy Statement
  • Whistleblowing Channel
  • Candidate Privacy Statement
