Find the right datasets for you

Aspect-Based Sentiment Annotations in European Spanish
More than 50K Aspect-Based Sentiment Annotations of product reviews in European Spanish.
Domain: Various
Locale: es-ES
Amount: 59.7K

Aspect-Based Sentiment Annotations in Japanese
More than 50K Aspect-Based Sentiment Annotations of product reviews in Japanese.
Domain: Various
Locale: JA, ja-JP
Amount: 55.2K

Aspect-Based Sentiment Annotations in German
More than 50K Aspect-Based Sentiment Annotations of product reviews in German.
Domain: Various
Locale: DE, de-DE
Amount: 59.7K

Aspect-Based Sentiment Annotations in American English
More than 50K Aspect-Based Sentiment Annotations of product reviews in American English.
Domain: Various
Locale: EN, en-US
Amount: 60K

Aspect-Based Sentiment Annotations in Mandarin Chinese
More than 50K Aspect-Based Sentiment Annotations of product reviews in Mandarin Chinese.
Domain: Various
Locale: ZH, zh-CN
Amount: 57.5K

Venezuelan Spanish Podcasts
37 hours of Venezuelan Spanish simulated podcasts, recorded with studio quality.
Domain: Various, Podcast
Locale: es-VE
Amount: 37 hours

Lebanese Arabic Podcasts
5 hours of Lebanese Arabic simulated podcasts, recorded with studio quality.
Domain: Various, Podcast
Locale: AR
Amount: 5 hours

Mexican Spanish Podcasts
17 hours of Mexican Spanish simulated podcasts, recorded with studio quality.
Domain: Various, Podcast
Locale: es-MX
Amount: 17 hours

German Podcasts
231 hours of German simulated podcasts, recorded with studio quality.
Domain: Various, Podcast
Locale: de-DE, DE
Amount: 30K hours

Moroccan French Podcasts
1 hour of Moroccan French simulated podcasts, recorded with studio quality.
Domain: Various, Podcast
Locale: FR, fr-MA
Amount: 1 hour
Showing 10 of 195 datasets



New datasets

Medical Claims Data for AI Model Training

Healthcare

Longitudinal Data in Oncology for AI Model Development

Healthcare

Wearable Health Data for AI Model Training

Healthcare

Hot datasets

Live Spanish Call Center Audio Dataset

Call Center

DICOM Medical Imaging Dataset with Clinical Reports

Healthcare

Multimodal Dataset for Household Robotics

Robotics
3D and Lidar


© 2026 DefinedCrowd. All rights reserved.
