Title :
A comparisonal study of the multi-layer Kohonen self-organizing feature maps for spoken language identification
Author :
Wang, Liang ; Ambikairajah, Eliathamby ; Choi, Eric H C
Abstract :
Our previous research indicates that the multi-layer Kohonen self-organizing feature map (MLKSFM) gives promising performance for spoken language identification (LID). In this paper, we enhance this approach in two ways. First, to exploit phase information, we propose a new type of feature vector that combines the modified group delay function (MODGDF) with traditional mel-frequency cepstral coefficients (MFCC). Second, we propose a hierarchical MLKSFM structure, in which pre-classification is performed at the lower-level MLKSFM and the final language identification is performed at the top-level MLKSFM. On the OGI-TS speech corpus, the best LID rate for 45-sec test utterances is 87.3%, achieved by the hierarchical MLKSFM with 4 classes pre-classified at the lower-level MLKSFM. For 10-sec test utterances, the best LID rate is 60.0%, achieved by the non-hierarchical MLKSFM LID system.
Keywords :
natural languages; self-organising feature maps; signal classification; speech processing; speech recognition; feature vector; modified group delay function; multilayer Kohonen self-organizing feature map; signal classification; speech corpus; speech utterance; spoken language identification; Australia; Delay; Fourier transforms; Labeling; Laboratories; Mel frequency cepstral coefficient; Natural languages; Neural networks; Speech; Training data; Language identification; hierarchical multi-layer Kohonen self-organizing feature map; modified group delay function;
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
DOI :
10.1109/ASRU.2007.4430146