DocumentCode
394372
Title
Variational inference and learning for segmental switching state space models of hidden speech dynamics
Author
Lee, Leo J. ; Attias, Hagai ; Deng, Li
Author_Institution
Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada
Volume
1
fYear
2003
fDate
6-10 April 2003
Abstract
This paper describes novel and powerful variational EM algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production. Hidden dynamic models (HDMs) have recently become a class of promising acoustic models to incorporate crucial speech-specific knowledge and overcome many inherent weaknesses of traditional HMMs. However, the lack of powerful and efficient statistical learning algorithms is one of the main obstacles preventing them from being well studied and widely used. Since exact inference and learning are intractable, a variational approach is taken to develop effective approximate algorithms. We have implemented the segmental constraint crucial for modeling speech dynamics and present algorithms for recovering hidden speech dynamics and discrete speech units from acoustic data only. The effectiveness of the algorithms developed is verified by experiments on simulation and Switchboard speech data.
Keywords
acoustic signal processing; belief networks; inference mechanisms; learning (artificial intelligence); optimisation; speech processing; speech recognition; state-space methods; variational techniques; Bayesian network; HMM; Switchboard speech data; acoustic data; acoustic models; approximate algorithms; discrete speech units; efficient statistical learning algorithms; hidden dynamic models; hidden speech dynamics; natural speech production; segmental constraint; segmental switching state space models; simulation; speech dynamics modeling; speech-specific knowledge; variational EM algorithms; variational inference; variational learning; Acoustic testing; Equations; Hidden Markov models; Humans; Inference algorithms; Machine learning algorithms; Natural languages; Speech enhancement; Speech recognition; State-space methods;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1198920
Filename
1198920