English books

If you need a large volume of published text that has entered the public domain, this collection of digitized public domain books is a good fit. It contains 92,655 carefully restored and digitized books in English, both fiction and non-fiction. The books are free from copyright restrictions and cover a diverse range of genres, including literature, science, philosophy, history, arts, social sciences, and more.

Dataset specs

Type: Text
File format: json
Region/Locale: EN
Amount: 92.7K
Dataset SubType: Books
Domain: Various
File Format: txt, pdf, json

Leverage

  • The diverse range of genres, from fiction to technical writing, makes the dataset well suited to training generative language models (such as GPT-based models) to produce coherent, contextually relevant text across various domains.

Use cases

  • Train models to analyze emotional tone and sentiment across different kinds of writing, from fiction to biography and self-help books, and to understand how tone and sentiment are expressed in different contexts.

  • The dataset's wide range of subjects allows for fine-tuning models for diverse domains such as business economics, mathematics, and philosophy, enabling the development of highly specialized AI models.

Do you need a specific dataset?

We understand the uniqueness of every project. That's why we offer customizable dataset solutions to match your specific requirements.

© 2026 DefinedCrowd. All rights reserved.
