• DocumentCode
    1761163
  • Title

    Automatic Variation of the Degree of Articulation in New HMM-Based Voices

  • Author

    Picart, Benjamin ; Drugman, Thomas ; Dutoit, Thierry

  • Author_Institution
    TCTS Lab., Univ. of Mons, Mons, Belgium
  • Volume
    8
  • Issue
    2
  • fYear
    2014
  • fDate
    April 2014
  • Firstpage
    307
  • Lastpage
    322
  • Abstract
    This paper focuses on the automatic modification of the degree of articulation (hypo- and hyperarticulation) of an existing standard neutral voice in the framework of HMM-based speech synthesis. Hypo- and hyperarticulation refer to the production of speech with, respectively, a reduction and an increase in articulatory effort compared to the neutral style. Starting from a source speaker for which neutral, hypo- and hyperarticulated speech data are available, statistical transformations are computed during the adaptation of the neutral speech synthesizer. These transformations are then applied to a new target speaker for which no hypo- or hyperarticulated recordings are available. Four statistical methods are investigated, differing in the speaking style adaptation technique (model-space Linear Scaling, LS, vs. CMLLR) and in the speaking style transposition approach (phonetic vs. acoustic correspondence) they use. The efficiency of these techniques is assessed separately for the transposition of prosody and of filter coefficients. In addition, we investigate which representation of the spectral envelope is best suited for this purpose: MGC, LSP, PARCOR or LAR coefficients. Subjective evaluations are performed to determine which statistical transformation method achieves the highest performance in terms of segmental quality, reproduction of the articulation degree and speaker identity preservation. The most successful method is finally used to automatically modify the degree of articulation of existing standard neutral voices.
  • Keywords
    filtering theory; hidden Markov models; speech synthesis; HMM based speech synthesis; HMM based voices; LAR coefficients; LSP coefficients; MGC coefficients; PARCOR coefficients; automatic modification; automatic variation; filter coefficients; hyperarticulated recordings; hyperarticulation; hypoarticulated recordings; neutral speech synthesizer; neutral style; source speaker; speaker identity preservation; speaking style transposition approach; standard neutral voice; statistical methods; statistical transformation method; statistical transformations; Adaptation models; Data models; Databases; Hidden Markov models; Speech; Standards; Transforms; Degree of articulation; HTS; expressive speech; speaking style transposition; speech synthesis; voice quality;
  • fLanguage
    English
  • Journal_Title
    Selected Topics in Signal Processing, IEEE Journal of
  • Publisher
    IEEE
  • ISSN
    1932-4553
  • Type

    jour

  • DOI
    10.1109/JSTSP.2014.2302742
  • Filename
    6736083