German Dataset
High-Quality German Call-Center and IVR Datasets for AI & Speech Models
Overview
Title
German Language Dataset
Dataset Type
Call-Center
Description
Unscripted, synthetic telephone conversations between an “agent” and a “customer”. Approximate audio duration per conversation: 5-15 minutes.
Use Case
ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours
64
Sample Rate
8 kHz
Audio Channel
Stereo
Recording Platform
Desktop
Audio Format
.wav
Transcription Format
.json
WER (%)
5
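The audio properties listed above (8 kHz, stereo, .wav) can be checked on a delivered file before it enters a training pipeline. The following is a minimal sketch using only Python's standard `wave` and `struct` modules. It assumes 16-bit PCM encoding, which this page does not state, and the function names `describe_wav` and `split_stereo` are hypothetical helpers introduced for illustration. Whether agent and customer occupy separate channels is also not stated here, so treat the left/right split as something to verify against the actual data.

```python
import struct
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.getnframes()
    return rate, channels, frames / rate


def split_stereo(path):
    """Split a stereo file into two lists of samples (left, right).

    Assumption: samples are 16-bit little-endian PCM. The dataset page
    only says ".wav", so other encodings are rejected explicitly.
    """
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        raw = wf.readframes(wf.getnframes())
    # Stereo frames are interleaved: L0, R0, L1, R1, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(samples[0::2]), list(samples[1::2])
```

A file that matches the specification should report a sample rate of 8000 and 2 channels from `describe_wav`, with a duration somewhere in the 5-15 minute range given in the description.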
Data Set Demographics
Country
Germany
Language
German
Gender
Female 478, Male 1440, Unknown 0
Number of Speakers
1,918
Age
18-50
Overview
Title
German Language Dataset
Dataset Type
IVR
Description
Human-to-machine. An IVR-style flow in which a TTS prompt (e.g. “How may I help you?”) is followed by a spontaneous human response.
Use Case
ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours
200
Sample Rate
8 kHz
Audio Channel
Stereo
Recording Platform
Desktop
Audio Format
.wav
Transcription Format
.json
WER (%)
5
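For readers unfamiliar with the WER (%) field: word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn a hypothesis transcript into the reference, divided by the number of words in the reference. The sketch below implements that standard definition with a word-level edit distance. How the figure on this page was actually measured is not stated, so this is an illustration of the metric rather than the vendor's procedure; `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

As a usage example, comparing the reference "ich möchte bitte meinen Vertrag kündigen" with the hypothesis "ich möchte meinen Vertrag kündigen" yields one deletion over six reference words, a WER of 1/6. Note that this sketch compares words case-sensitively and does no text normalization, both of which affect real WER measurements.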
Data Set Demographics
Country
Germany
Language
German
Gender
Female 10115, Male 8750, Unknown 0
Number of Speakers
18,865
Age
18-50
Can’t find what you are looking for?
New off-the-shelf datasets are being collected across all data types.
Contact us to take the work of collecting audio and speech training data off your hands.