Overview
Title: Arabic Language Dataset
Dataset Type: General Conversation
Description: Unscripted telephone conversations between two people, in Arabic as spoken in Gulf countries. Each recording runs approximately 15 to 60 minutes.
Use Case: ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours: 292
Sample Rate: 8 kHz
Audio Channel: Dual
Recording Platform: Desktop
Audio Format: .wav
Transcription Format: .json
WER (%): 5
Data Set Demographics
Country: Gulf countries
Language: Arabic
Gender: Female 838, Male 1,209, Unknown 78
Number of Speakers: 706
Age: 18-50
Overview
Title: Arabic Language Dataset
Dataset Type: TTS
Description: Single-utterance recordings, typically 5 to 30 seconds each.
Use Case: ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling
Data Set Details
Total hours: 1,947
Sample Rate: 48 kHz
Audio Channel: Mono
Recording Platform: Mobile App
Audio Format: .wav
Transcription Format: .json
WER (%): 5
Data Set Demographics
Country: Arabic
Language: Arabic
Gender: Female 838, Male 1,209, Unknown 78
Number of Speakers: 2,125
Age: 18-50
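
For buyers who want to confirm that delivered files match the listed specifications, the following is a minimal sketch using Python's standard wave module. It compares a .wav header against the sample rate and channel count given for each dataset above. The names EXPECTED, check_wav, "conversation", and "tts" are hypothetical and not part of any delivered tooling, and the sketch assumes that "Dual" means two channels and "Mono" means one.

```python
import wave

# Hypothetical spec table built from the two listings above.
# Assumption: "Dual" = 2 channels, "Mono" = 1 channel.
EXPECTED = {
    "conversation": {"sample_rate": 8000, "channels": 2},
    "tts": {"sample_rate": 48000, "channels": 1},
}


def check_wav(path, kind):
    """Compare a WAV file's header with the listed spec for `kind`.

    Returns (problems, duration_seconds), where `problems` is a list of
    human-readable mismatch descriptions (empty if the header matches).
    """
    spec = EXPECTED[kind]
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.getnframes()

    problems = []
    if rate != spec["sample_rate"]:
        problems.append(
            f"sample rate is {rate} Hz, expected {spec['sample_rate']} Hz"
        )
    if channels != spec["channels"]:
        problems.append(
            f"channel count is {channels}, expected {spec['channels']}"
        )

    # Duration is reported but not enforced, since the listed ranges
    # are approximate.
    duration = frames / rate
    return problems, duration
```

Calling check_wav on a file from the conversation dataset with kind "conversation" should return an empty problem list if the header matches; the returned duration can then be compared by eye against the approximate ranges in each description.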