Title :
Speech recognition modeling advances for mobile voice search
Author :
Bocchieri, Enrico ; Caseiro, Diamantino ; Dimitriadis, Dimitrios
Author_Institution :
AT&T Res., Florham Park, NJ, USA
Abstract :
This paper reports on the development and advances in automatic speech recognition for the AT&T Speak4it® voice-search application. With Speak4it as a real-life example, we show the effectiveness of acoustic model (AM) and language model (LM) estimation (adaptation and training) on relatively small amounts of application field-data. We then introduce algorithmic improvements concerning the use of sentence length in the LM, of non-contextual features in AM decision-trees, and of the Teager energy in the acoustic front-end. The combination of these algorithms, integrated into the AT&T Watson recognizer, yields substantial accuracy improvements. LM and AM estimation on field-data samples increases the word accuracy from 66.4% to 77.1%, a relative word error reduction of 32%. The algorithmic improvements increase the accuracy to 79.7%, an additional 11.3% relative error reduction.
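The relative error reductions quoted in the abstract can be checked from the word accuracies. The sketch below assumes that word error is 100 minus word accuracy (both in percent); the function name is hypothetical and not taken from the paper.

```python
def relative_error_reduction(acc_before, acc_after):
    """Relative reduction in word error (percent), given word accuracies in percent.

    Assumption: word error = 100 - word accuracy.
    """
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before


# Field-data LM and AM estimation: 66.4% -> 77.1% word accuracy
print(round(relative_error_reduction(66.4, 77.1), 1))

# Algorithmic improvements: 77.1% -> 79.7% word accuracy
print(round(relative_error_reduction(77.1, 79.7), 1))
```

With the rounded accuracies from the abstract this gives roughly 31.8 and 11.4, which agrees with the reported 32% and 11.3% up to rounding of the published figures.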
Keywords :
speech recognition; trees (mathematics); AM decision-trees; AM estimation; LM estimation; Teager energy; acoustic model estimation; language model estimation; mobile voice search; speech recognition modeling; Accuracy; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; HMM; decision tree clustering
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947451