• DocumentCode
    417227
  • Title

    Refining segmental boundaries for TTS database using fine contextual-dependent boundary models

  • Author

    Wang, Lijuan ; Zhao, Yong ; Chu, Min ; Zhou, Jianlai ; Cao, Zhigang

  • Author_Institution
    Dept. of Electr. Eng., Tsinghua Univ., Beijing, China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    This paper proposes a post-refining method that uses fine contextual-dependent GMMs for the auto-segmentation task. A GMM trained on a super feature vector, extracted from multiple evenly spaced frames near the boundary, is used to describe the waveform evolution across a boundary. CART is used to cluster acoustically similar GMMs, so that the GMM for each leaf node can be reliably trained from the limited set of manually labeled boundaries. An accuracy of 90% is achieved when only 250 manually labeled sentences are provided to train the refining models.
  • Keywords
    Gaussian distribution; feature extraction; hidden Markov models; pattern clustering; speech synthesis; CART; GMM; TTS database; auto-segmentation task; fine contextual-dependent boundary models; manually labeled boundaries; multiple evenly spaced frames; pattern clustering; post-refining method; segmental boundaries; super feature vector extraction; waveform evolution; Asia; Automatic speech recognition; Context modeling; Databases; Feature extraction; Hidden Markov models; Labeling; Neural networks; Speech synthesis; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326067
  • Filename
    1326067
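
The abstract describes a super feature vector built from multiple evenly spaced frames near a boundary. The following is a minimal sketch of that one step, not the authors' implementation. The function name `super_feature_vector`, the parameters `frames_each_side` and `spacing`, and the choice to clamp indices at the ends of the utterance are all assumptions made for illustration; the record does not state how many frames are used or how far apart they are.

```python
from typing import List, Sequence


def super_feature_vector(
    frames: Sequence[Sequence[float]],
    boundary_index: int,
    frames_each_side: int = 2,  # assumed value, not given in the abstract
    spacing: int = 1,           # assumed value, not given in the abstract
) -> List[float]:
    """Concatenate evenly spaced frames around a boundary into one vector.

    `frames` holds one acoustic feature vector per analysis frame.
    Frames are taken at boundary_index + k * spacing for
    k = -frames_each_side .. +frames_each_side.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    last = len(frames) - 1
    vector: List[float] = []
    for k in range(-frames_each_side, frames_each_side + 1):
        idx = boundary_index + k * spacing
        # Assumption: clamp to the first/last frame near utterance edges.
        idx = min(max(idx, 0), last)
        vector.extend(frames[idx])
    return vector


# Toy data: 10 frames with 2 features each.
toy_frames = [[float(i), float(i) * 10.0] for i in range(10)]
example = super_feature_vector(toy_frames, boundary_index=5,
                               frames_each_side=1, spacing=2)
```

In this sketch the resulting vector has `(2 * frames_each_side + 1)` times the per-frame dimension. A vector of this kind would then be the input on which a boundary GMM is trained, as the abstract outlines.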