Polish Dataset
Overview
Title: Polish Language Dataset
Dataset Type: Media Audio
Description: Licensable public-domain audio/video files such as interviews and podcasts, featuring 1 to 5 people. Approximate audio duration range: 15-60 minutes.
Use Case: ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours: 269
Sample Rate: 16 kHz
Audio Channel: Mono
Recording Platform: Web Sourcing
Audio Format: .wav
Transcription Format: .json
WER (%): 5
Data Set Demographics
Country: Poland
Language: Polish
Gender: Female 173, Male 354, Unknown 6
Number of Speakers: 533
Age: 18-50
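When a delivery arrives, it can help to confirm that each file matches the listed audio spec. The sketch below uses Python's standard `wave` module to compare a file's sample rate and channel count with expected values. The function name `check_wav_spec` and its defaults (16 kHz, mono, taken from the details above) are illustrative assumptions, not part of any vendor tooling.

```python
import wave


def check_wav_spec(path, expected_rate=16000, expected_channels=1):
    """Compare a WAV file's header with an expected sample rate and channel count.

    Returns (problems, duration_seconds); problems is empty when the file matches.
    The defaults mirror the 16 kHz mono listing and are assumptions for illustration.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration_s = wav.getnframes() / rate

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channel(s), expected {expected_channels}")
    return problems, duration_s
```

For a dataset listed at a different rate, pass the value explicitly, for example `check_wav_spec(path, expected_rate=48000)`.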
Overview
Title: Polish Language Dataset
Dataset Type: TTS
Description: Single-utterance recordings, which tend to fall in the 5 to 30 second range.
Use Case: ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours: 1,482
Sample Rate: 48 kHz
Audio Channel: Mono
Recording Platform: Mobile App
Audio Format: .wav
Transcription Format: .json
WER (%): 5
Data Set Demographics
Country: Poland
Language: Polish
Gender: Female 1,324, Male 701, Unknown 24
Number of Speakers: 2,049
Age: 18-50
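Both listings quote a WER of 5% without saying how it was measured. A common definition divides the word-level edit distance (substitutions, deletions, insertions) by the number of words in the reference transcript. The sketch below follows that definition. It is an assumption about the metric, not the vendor's stated method, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumes whitespace tokenization; real evaluations usually normalize
    case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 5% corresponds to one word error per 20 reference words on average.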