General Knowledge Prompt and Response Data for LLMs

Live Data
Large Language Models

About this Dataset

Newly added: one of the most valuable data assets for natural language understanding and LLM training. This dataset contains unprompted, user-initiated prompts from one million unique users interacting with a generic digital assistant. The data is cleansed of personally identifiable information (PII), and each prompt carries intent and entity annotations. Queries cover hundreds of intents and subintents, such as asking about the weather, searching for businesses, playing music, and asking knowledge questions.
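To make the annotation structure concrete, here is a minimal Python sketch of how one annotated prompt might be represented and tallied by intent. The class names, field names (`prompt`, `intent`, `subintent`, `entities`), labels, and sample records are all hypothetical and made up for illustration. The dataset's actual schema is not described on this page and may differ.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical entity annotation: a label plus the matching prompt text.
    label: str
    text: str


@dataclass
class AnnotatedPrompt:
    # Hypothetical record layout; the real dataset's field names may differ.
    prompt: str
    intent: str
    subintent: str
    entities: list[Entity] = field(default_factory=list)


def intent_counts(records):
    """Count how many prompts fall under each intent."""
    return Counter(r.intent for r in records)


# Made-up examples for illustration only, not taken from the dataset.
records = [
    AnnotatedPrompt(
        "what's the weather in Seattle tomorrow",
        "weather",
        "forecast",
        [Entity("location", "Seattle"), Entity("date", "tomorrow")],
    ),
    AnnotatedPrompt("play some jazz", "music", "play_genre", [Entity("genre", "jazz")]),
    AnnotatedPrompt("will it rain today", "weather", "forecast", [Entity("date", "today")]),
]

counts = intent_counts(records)
print(counts.most_common())
```

A tally like this is one simple way to check how evenly a sample covers the intents of interest before training on it.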

License Information

This dataset is covered by our standard Data License Agreement. The license agreement is perpetual and allows for the commercialization of all models built on the data.


© 2023 DefinedCrowd. All rights reserved.