Emilia

Large-scale multilingual speech generation dataset

Provides over 101,000 hours of high-quality speech data in six languages
Includes speech with text transcriptions in Chinese, English, Japanese, Korean, German, and French
Sourced from diverse video platforms and podcasts on the Internet, covering a wide range of content types
Supports data preprocessing with the open-source Emilia-Pipe preprocessing pipeline
Lets researchers download the raw audio files and reconstruct the dataset
Emilia-Pipe supports custom preprocessing of speech data to meet specific research needs

Product Details

Emilia is an open-source, multilingual, in-the-wild speech dataset designed for large-scale speech generation research. It contains over 101,000 hours of high-quality speech data in six languages with corresponding text transcriptions, covering a variety of speaking styles and content types such as talk shows, interviews, debates, sports commentary, and audiobooks.
