Title :
GMM as an alternative to HMM in the search for the optimal warping factor for VTLN
Author :
Mayor Martins, Ramon ; Ynoguti, Carlos Alberto
Author_Institution :
Inst. Nac. de Telecomun., Santa Rita do Sapucaí, Brazil
Abstract :
Acoustic variability among speakers is one of the factors that degrade the performance of automatic speech recognition systems, and Vocal Tract Length Normalization (VTLN) is one of the most widely used techniques for coping with this variability. In this technique, a warping factor is applied to the mel filterbank in order to normalize the vocal tract length across speakers. Traditionally, the warping factor is found by sweeping over candidate values and selecting the one that leads to the best performance. An HMM is usually used for this purpose; in this paper, we propose using a GMM as an alternative in the search for the optimal warping factor, which avoids the phonetic transcription required by the HMM approach. Tests show that the two approaches perform similarly in both WER and computational cost.
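The sweep described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a diagonal-covariance GMM whose parameters are already trained, and a hypothetical callable `extract_features(alpha)` that returns the utterance's feature matrix computed with the mel filterbank warped by `alpha`. The candidate grid is left to the caller, since the abstract does not state the one used in the paper. The selection criterion here is the average frame log-likelihood under the GMM, which needs no phonetic transcription.

```python
import numpy as np

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of X (T x D) under a diagonal GMM.

    weights: (K,), means: (K, D), variances: (K, D). All are assumed inputs.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - means[None, :, :]                     # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                            # (T, K)
    log_joint = log_comp + np.log(weights)                       # add mixture priors
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    m = log_joint.max(axis=1, keepdims=True)
    frame_ll = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    return float(frame_ll.mean())

def best_warping_factor(extract_features, alphas, weights, means, variances):
    """Score every candidate warping factor and return the best one.

    extract_features is a hypothetical callable: alpha -> (T x D) features.
    """
    scores = [
        gmm_avg_loglik(extract_features(a), weights, means, variances)
        for a in alphas
    ]
    return float(alphas[int(np.argmax(scores))]), scores
```

A caller would pass a grid such as `np.linspace(0.88, 1.12, 13)` as `alphas`; that range is a common VTLN convention and is an assumption here, not a value taken from the paper.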
Keywords :
Gaussian processes; mixture models; speaker recognition; GMM; VTLN; automatic speech recognition systems; mel filterbank; optimal warping factor; vocal tract length normalization technique; Acoustics; Computational efficiency; Computational modeling; Hidden Markov models; Speech; Speech recognition; Vectors;
Conference_Title :
Telecommunications Symposium (ITS), 2014 International
Conference_Location :
Sao Paulo
DOI :
10.1109/ITS.2014.6948003