DocumentCode
144997
Title
GMM as an alternative to HMM in the search for the optimal warping factor for VTLN
Author
Mayor Martins, Ramon ; Ynoguti, Carlos Alberto
Author_Institution
Inst. Nac. de Telecomun., Santa Rita do Sapucaí, Brazil
fYear
2014
fDate
17-20 Aug. 2014
Firstpage
1
Lastpage
4
Abstract
Acoustic variability among speakers is one of the factors that degrade the performance of automatic speech recognition systems, and Vocal Tract Length Normalization (VTLN) is one of the most widely used techniques for coping with this variability. In this technique, a warping factor is applied to the mel filterbank in order to normalize the vocal tract length of all speakers. Traditionally, the warping factor is found by a sweep over candidate values, selecting the one that leads to the best performance. An HMM is usually used for this purpose; in this paper, we propose using a GMM as an alternative in the search for the optimal warping factor, which avoids the phonetic transcription needed by the HMM approach. The tests show that the two approaches perform similarly in both WER and computational cost.
Keywords
Gaussian processes; mixture models; speaker recognition; GMM; VTLN; automatic speech recognition systems; mel filterbank; optimal warping factor; vocal tract length normalization technique; Acoustics; Computational efficiency; Computational modeling; Hidden Markov models; Speech; Speech recognition; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Telecommunications Symposium (ITS), 2014 International
Conference_Location
São Paulo
Type
conf
DOI
10.1109/ITS.2014.6948003
Filename
6948003