Title :
Combined waveform-cepstral representation for robust speech recognition
Author :
Matthew Ager;Zoran Cvetković;Peter Sollich
Author_Institution :
Department of Mathematics, King's College London, UK
fDate :
7/1/2011 12:00:00 AM
Abstract :
High-dimensional acoustic waveform representations are studied as a front-end for noise-robust automatic speech recognition using generative methods, in particular Gaussian mixture models and hidden Markov models. The proposed representations are compared with standard cepstral features on phoneme classification and recognition tasks. While cepstral features achieve lower error rates at very low noise levels, the acoustic waveform representations are much more robust to noise. A convex combination of acoustic waveforms and cepstral features is then considered; it achieves higher accuracy than either individual representation across all noise levels.
Keywords :
"Speech recognition","Speech","Hidden Markov models","Noise","Mel frequency cepstral coefficient"
Conference_Titel :
Information Theory Proceedings (ISIT), 2011 IEEE International Symposium on
Print_ISBN :
978-1-4577-0596-0
Electronic_ISBN :
2157-8117
DOI :
10.1109/ISIT.2011.6034260