Title :
Robust language identification based on fused phonotactic information with MLKSFM pre-classifier
Author :
Wang, Liang ; Ambikairajah, Eliathamby ; Choi, Eric H C
Date :
June 28 2009-July 3 2009
Abstract :
In this paper we propose a novel language identification system which utilizes fused phonotactic information. The phase spectrum of speech signals is used with the magnitude spectrum in order to obtain a more robust feature representation. Parallel Broad Phoneclass Recognition followed by Language Model (PBPRLM) is used in order to remove the bias of the likelihood scores introduced by the size inequality of phone inventories in traditional PPRLM systems. The likelihood scores from the MFCC-based and group-delay-based PPRLM and PBPRLM systems are fused together by using a Gaussian Mixture Model. Furthermore, a pre-classification based on Kohonen's map is used in order to maintain the system robustness while handling a large number of target languages. Using this proposed novel system we achieve an EER of 6.7% on the 2005 NIST LRE, and a LID recognition rate of 83.9% on a 22-language task.
Keywords :
natural language processing; speech recognition; Gaussian mixture model; feature representation; fused phonotactic information; language identification system; language model; magnitude spectrum; parallel broad phone-class recognition; robust language identification; speech signals phase spectrum; Australia; Cepstral analysis; Data mining; Laboratories; NIST; Natural languages; Robustness; Speech recognition; Support vector machines; System performance; Robust language identification; likelihood score bias; phase spectrum; preclassification; score fusion
Conference_Title :
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4244-4290-4
Electronic_ISBN :
1945-7871
DOI :
10.1109/ICME.2009.5202451