Czech Scripted Monologue
About the Dataset
This audio dataset contains 1249 hours of Czech speech data in the Automotive, Banking, Insurance, Retail and Telecommunication domains, recorded by native Czech speakers from the Czech Republic.
Domain distribution per dataset:
- 152.93 hours of Automotive
- 271.13 hours of Banking
- 276.02 hours of Insurance
- 270 hours of Retail
- 279.68 hours of Telecommunication
The speakers are presented with a prompt (script) and asked to read it out loud and record it. Our clients will receive the audio recording, the prompt and information about the speaker. The audio is recorded on-device, typically at 16 kHz, 16-bit. We also provide information on the device used for each recording.
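Because the audio is described as "typically" 16 kHz, 16-bit, it can be useful to check each file's actual format before training. The following is a minimal sketch using Python's standard `wave` module. It assumes the recordings are delivered as uncompressed PCM WAV files, which this description does not state; the function name `check_wav_format` and the file name in the usage note are hypothetical.

```python
import wave

# Values taken from the description above: 16 kHz sample rate, 16-bit samples.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits = 2 bytes per sample


def check_wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, matches_expected) for a PCM WAV file.

    Assumption: the file is an uncompressed PCM WAV; the delivery format
    is not specified in the dataset description.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    matches = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
    return rate, width * 8, matches
```

For example, `check_wav_format("recording.wav")` returns the sample rate, the bit depth and a flag indicating whether the file matches the typical format, so recordings that deviate can be resampled or set aside.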
The dataset is covered by Defined.ai's standard license agreement. The license agreement is perpetual and allows for the commercialization of all models built on the data.