
Emilia
Large-scale multilingual speech generation dataset
- Provides over 101,000 hours of high-quality speech data in six languages
- Pairs speech with text transcriptions in Chinese, English, Japanese, Korean, German, and French
- Sourced from a diverse range of online video platforms and podcasts, covering a wide variety of content types
- Supports data preprocessing with Emilia-Pipe, the open-source preprocessing pipeline
- Allows researchers to download the raw audio files and reconstruct the dataset
- Emilia-Pipe supports custom preprocessing of speech data to meet specific research needs
Product Details
Emilia is an open-source multilingual dataset of in-the-wild speech, designed specifically for large-scale speech generation research. It contains over 101,000 hours of high-quality speech data in six languages with corresponding text transcriptions, and covers a range of speaking styles and content types such as talk shows, interviews, debates, sports commentary, and audiobooks.
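
As a rough illustration of the kind of custom filtering a researcher might apply to transcribed segments, the sketch below selects segments by language and duration and totals the hours of audio. The `Segment` record, its field names, and the duration thresholds are hypothetical and chosen only for this example; they are not the dataset's actual schema or any part of its preprocessing pipeline.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical record layout for illustration; not the real metadata schema.
    language: str      # language code, e.g. "en" or "de"
    duration_s: float  # segment length in seconds
    text: str          # transcription of the segment


def select_segments(segments, language, min_s=3.0, max_s=30.0):
    """Keep segments in one language whose duration lies in [min_s, max_s]."""
    return [
        s for s in segments
        if s.language == language and min_s <= s.duration_s <= max_s
    ]


def total_hours(segments):
    """Sum segment durations and convert seconds to hours."""
    return sum(s.duration_s for s in segments) / 3600.0


if __name__ == "__main__":
    demo = [
        Segment("en", 5.0, "hello"),
        Segment("en", 1.0, "hi"),        # too short, dropped
        Segment("de", 10.0, "hallo"),    # other language, dropped
        Segment("en", 31.0, "long"),     # too long, dropped
    ]
    kept = select_segments(demo, "en")
    print(len(kept), total_hours(kept))
```

The thresholds are plain keyword arguments so that a different study can tighten or relax them without changing the selection logic.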