A hybrid physical and statistical dynamic articulatory framework incorporating analysis-by-synthesis for improved phone classification

Author

Al Bawab, Ziad ; Raj, Bhiksha ; Stern, Richard M.

Author_Institution

Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear

2010

fDate

14-19 March 2010

Firstpage

4194

Lastpage

4197

Abstract

In this paper, we present a dynamic articulatory model for phone classification. The model integrates real articulatory information derived from Electromagnetic Articulograph (EMA) data into its inner states. It maps from the articulatory space to the acoustic space using a vocal tract model adapted to each speaker and a physiologically motivated articulatory synthesis approach. We apply the analysis-by-synthesis paradigm in a statistical fashion. We first present a fast approach for deriving analysis-by-synthesis distortion features. Next, the distortion between the speech synthesized from the articulatory states and the incoming speech signal is used to compute the output observation probabilities of the Hidden Markov Model (HMM) used for classification. Experiments with the novel framework show improvements over the baseline in phone classification accuracy.
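
The abstract's last step, turning analysis-by-synthesis distortion into HMM output observation probabilities, can be illustrated with a minimal sketch. This is not the paper's formulation: the squared Euclidean distortion, the per-state weights over articulatory configurations, the `variance` parameter, and the function names `squared_distortion` and `log_observation_prob` are all assumptions made for illustration only.

```python
import math


def squared_distortion(observed, synthesized):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((o - s) ** 2 for o, s in zip(observed, synthesized))


def log_observation_prob(observed, codebook, log_weights, variance=1.0):
    """Hypothetical state output score: log sum_k w_k * exp(-d_k / (2 * variance)).

    codebook: features synthesized from each articulatory configuration.
    log_weights: log of this state's weight for each configuration (assumed).
    """
    terms = [
        lw - squared_distortion(observed, synth) / (2.0 * variance)
        for lw, synth in zip(log_weights, codebook)
    ]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Toy usage: two synthesized configurations, one incoming frame, two states.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frame = [0.9, 1.1]
state_a = [math.log(0.9), math.log(0.1)]  # favors the first configuration
state_b = [math.log(0.1), math.log(0.9)]  # favors the second configuration
score_a = log_observation_prob(frame, codebook, state_a)
score_b = log_observation_prob(frame, codebook, state_b)
```

In this toy setup the frame lies close to the second synthesized configuration, so the state that weights that configuration more heavily receives the higher score. A real system would plug such scores into standard HMM decoding.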

Keywords

hidden Markov models; speech recognition; speech synthesis; analysis-by-synthesis distortion features; automatic speech recognition systems; electromagnetic articulograph; hidden Markov model; phone classification; physical dynamic articulatory framework; physiologically-motivated articulatory synthesis; statistical dynamic articulatory framework; Acoustic distortion; Automatic speech recognition; Hidden Markov models; Loudspeakers; Shape measurement; Signal synthesis; Solid modeling; Space technology; Speech recognition; Speech synthesis; Dynamic articulatory modeling; analysis-by-synthesis; articulatory synthesis for recognition; hybrid physical; physical model of the vocal tract; statistical models for classification;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location

Dallas, TX

ISSN

1520-6149

Print_ISBN

978-1-4244-4295-9

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2010.5495696

Filename

5495696