DocumentCode
2580588
Title
Vocal tract length normalization strategy based on maximum likelihood criterion
Author
Jakovljevic, Niksa M.; Secujski, M.S.; Delic, V.D.
fYear
2009
fDate
18-23 May 2009
Firstpage
399
Lastpage
402
Abstract
This paper presents the performance of automatic speech recognition systems that use vocal tract length normalization (VTN). Besides the standard procedure for VTN coefficient estimation, several variants based on robust statistical methods are introduced. All systems that use VTN outperformed the reference systems. The best performance was achieved by the system in which the VTN coefficient for a particular speaker is chosen as the one with the maximum sample mean of per-phoneme likelihoods, where phoneme likelihoods are calculated as sample medians over the feature vectors corresponding to each phoneme. The relative improvement in performance for this system is about 20%.
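The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-frame log-likelihoods are already available for each candidate VTN coefficient, and it reads the "sample median" as the median of those per-frame scores within each phoneme. The function name `select_vtn_coefficient` and its inputs are hypothetical.

```python
import numpy as np

def select_vtn_coefficient(frame_loglik, phoneme_ids):
    """Pick the candidate VTN coefficient with the highest mean of
    per-phoneme median scores (hypothetical helper, illustrative only).

    frame_loglik: dict mapping candidate coefficient -> per-frame
                  log-likelihoods for one speaker (assumed precomputed).
    phoneme_ids:  phoneme label of each frame, same length as the scores.
    """
    phoneme_ids = np.asarray(phoneme_ids)
    phonemes = np.unique(phoneme_ids)
    scores = {}
    for alpha, loglik in frame_loglik.items():
        loglik = np.asarray(loglik, dtype=float)
        # Robust per-phoneme score: median over the frames of that phoneme.
        per_phoneme = [np.median(loglik[phoneme_ids == p]) for p in phonemes]
        # Speaker-level score: sample mean over phonemes.
        scores[alpha] = float(np.mean(per_phoneme))
    best = max(scores, key=scores.get)
    return best, scores

# Toy data: the outlier frame (-50.0) barely affects the median-based score.
toy_scores = {
    0.9: [-5.0, -6.0, -4.0, -10.0],
    1.0: [-3.0, -2.0, -50.0, -4.0],
    1.1: [-7.0, -7.0, -7.0, -7.0],
}
toy_labels = ["a", "a", "a", "b"]
best_alpha, all_scores = select_vtn_coefficient(toy_scores, toy_labels)
print(best_alpha, all_scores)
```

Averaging over phonemes rather than over frames keeps frequent phonemes from dominating the choice, and the median limits the influence of outlier frames. Both points are the general motivation for robust statistics, not claims taken from the paper.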
Keywords
maximum likelihood estimation; speech recognition; automatic speech recognition systems; coefficient estimation; feature vectors; maximum likelihood criterion; particular speaker; phoneme likelihoods; robust statistic methods; vocal tract length normalization strategy; Automatic speech recognition; Hidden Markov models; Loudspeakers; Materials testing; Mel frequency cepstral coefficient; Parameter estimation; Piecewise linear techniques; Robustness; Statistics; vocal tract length normalization
fLanguage
English
Publisher
IEEE
Conference_Titel
EUROCON 2009, EUROCON '09. IEEE
Conference_Location
St. Petersburg
Print_ISBN
978-1-4244-3860-0
Electronic_ISBN
978-1-4244-3861-7
Type
conf
DOI
10.1109/EURCON.2009.5167662
Filename
5167662