Emilia

Large-scale multilingual speech generation dataset

  • Provides over 101,000 hours of high-quality speech data in six languages
  • Includes speech with text transcriptions in Chinese, English, Japanese, Korean, German, and French
  • Sourced from a diverse range of video platforms and podcasts on the Internet, covering many content types
  • Comes with Emilia-Pipe, an open-source pipeline for preprocessing the data
  • Allows researchers to download the raw audio files and reconstruct the dataset
  • Emilia-Pipe supports custom preprocessing of speech data to meet specific research needs

Product Details

Emilia is an open-source, multilingual, in-the-wild speech dataset designed specifically for large-scale speech generation research. It contains over 101,000 hours of high-quality speech data in six languages, together with the corresponding text transcriptions, and covers a variety of speaking styles and content types such as talk shows, interviews, debates, sports commentary, and audiobooks.