• DocumentCode
    2177834
  • Title
    A frame mapping based HMM approach to cross-lingual voice transformation
  • Author
    Qian, Yao ; Xu, Ji ; Soong, Frank K.
  • Author_Institution
    Microsoft Res. Asia, Beijing, China
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5120
  • Lastpage
    5123
  • Abstract
    Cross-lingual voice transformation is challenging when the source language (L1) and the target language (L2) differ greatly in phonetics and prosody. We propose a frame mapping based HMM approach to this problem. The source speaker's speech data is first warped in frequency toward the target speaker by mapping corresponding formants of selected vowels. The parameter trajectories of the warped data are then "tiled" with frames from the target speaker's L2 data. The tiled trajectories form a simulated training set of the target speaker in L1, which is used to train an HMM TTS. With a bilingual (Mandarin and English) source speaker and a monolingual (English) target speaker, the frame mapping based approach is capable of generating highly intelligible, good quality speech data in L1 (Mandarin) that sounds rather close to the target speaker. The good performance of the cross-lingual voice transformation is confirmed by subjective evaluations of speaker similarity, naturalness and intelligibility.
  • Keywords
    hidden Markov models; languages; speaker recognition; HMM TTS; bilingual source speaker; cross-lingual voice transformation; frame mapping based HMM approach; monolingual target speaker; parameter trajectory; phonetic; prosody; source language; speaker speech data source; speech data quality; target language; Adaptation models; Data models; Hidden Markov models; Speech; Training; Trajectory; Transforms; Cross-lingual; HMM-based TTS; VTLN;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947509
  • Filename
    5947509