• DocumentCode
    2801439
  • Title
    A hybrid physical and statistical dynamic articulatory framework incorporating analysis-by-synthesis for improved phone classification
  • Author
    Al Bawab, Ziad ; Raj, Bhiksha ; Stern, Richard M.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4194
  • Lastpage
    4197
  • Abstract
    In this paper, we present a dynamic articulatory model for phone classification. The model integrates real articulatory information derived from ElectroMagnetic Articulograph (EMA) data into its inner states. It maps from the articulatory space to the acoustic space using a vocal tract model adapted to each speaker and a physiologically motivated articulatory synthesis approach. We apply the analysis-by-synthesis paradigm in a statistical fashion. We first present a fast approach for deriving analysis-by-synthesis distortion features. Next, the distortion between the speech synthesized from the articulatory states and the incoming speech signal is used to compute the output observation probabilities of the Hidden Markov Model (HMM) used for classification. Experiments with the novel framework show improvements over the baseline in phone classification accuracy.
  • Keywords
    hidden Markov models; speech recognition; speech synthesis; analysis-by-synthesis distortion features; automatic speech recognition systems; electromagnetic articulograph; hidden Markov model; phone classification; physical dynamic articulatory framework; physiologically-motivated articulatory synthesis; statistical dynamic articulatory framework; Acoustic distortion; Automatic speech recognition; Hidden Markov models; Loudspeakers; Shape measurement; Signal synthesis; Solid modeling; Space technology; Speech recognition; Speech synthesis; Dynamic articulatory modeling; analysis-by-synthesis; articulatory synthesis for recognition; hybrid physical; physical model of the vocal tract; statistical models for classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495696
  • Filename
    5495696