Regional Information Center for Science and Technology - An integrated language identification for code-switched speech using decoded-phonemes and support vector machine

DocumentCode :

653734

Title :

An integrated language identification for code-switched speech using decoded-phonemes and support vector machine

Author :

Mabokela, Koena Ronny ; Manamela, Madimetja Jonas

Author_Institution :

Dept. of Comput. Sci., Univ. of Limpopo, Polokwane, South Africa

fYear :

2013

fDate :

16-19 Oct. 2013

Firstpage :

Lastpage :

Abstract :

Automatic language identification (LID) is a specialized area of Human Language Technology in which the language(s) used in spoken utterances are identified and correctly classified given a predetermined set of target languages. Most multilingual speakers are able, and tend, to engage in code-switching, a mixed-language phenomenon in which more than one language is used within an utterance. This paper presents a scheme for automatic language identification integrated with an automatic speech recognition system to identify the languages used in a mixed-language speech context. The front-end speech recognition system feeds the decoded phonemes into the LID system. We used hidden Markov models to build acoustic models of a combined phoneme set that handles multiple languages within an utterance. A spoken utterance is converted into feature vectors whose attributes represent the statistical occurrences of each acoustic unit. A supervised support vector machine (SVM) is trained on feature vector sequences of phoneme units. The back-end SVM classifier, based on n-gram structures, is used to classify the phoneme feature vectors. We conducted experiments on telephone-based speech corpora of two commonly mixed languages, Northern Sotho and English. The experimental results showed that, by using shared phonemic vowels in the combined phoneme set, the word error rate (WER) was reduced by 3.6%. Moreover, the proposed approach yields acceptable performance, with a language identification rate of 85.0% on a code-switched speech corpus.
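
The feature-extraction step described in the abstract (decoded phonemes turned into vectors of n-gram occurrence statistics) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the choice of unigrams plus bigrams, the relative-frequency normalisation, and the toy phoneme labels are all assumptions made for the example. The resulting vectors are the kind of input a back-end SVM classifier would be trained on.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Ngram = Tuple[str, ...]


def phoneme_ngrams(phonemes: Sequence[str], n: int) -> List[Ngram]:
    """Return all contiguous n-grams of a decoded phoneme sequence."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_vocabulary(utterances: Sequence[Sequence[str]], max_n: int = 2) -> Dict[Ngram, int]:
    """Map every n-gram (n = 1..max_n) seen in the training utterances to a column index."""
    vocab: Dict[Ngram, int] = {}
    for phonemes in utterances:
        for n in range(1, max_n + 1):
            for gram in phoneme_ngrams(phonemes, n):
                vocab.setdefault(gram, len(vocab))
    return vocab


def ngram_feature_vector(phonemes: Sequence[str], vocab: Dict[Ngram, int], max_n: int = 2) -> List[float]:
    """Convert one utterance into a vector of relative n-gram frequencies.

    N-grams that are not in the vocabulary are ignored (an assumption of this sketch).
    """
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(g for g in phoneme_ngrams(phonemes, n) if g in vocab)
    vector = [0.0] * len(vocab)
    total = sum(counts.values())
    if total == 0:
        return vector
    for gram, count in counts.items():
        vector[vocab[gram]] = count / total
    return vector


# Hypothetical decoded phoneme strings (toy labels, not taken from the corpora).
training_utterances = [
    ["d", "u", "m", "e", "l", "a"],
    ["h", "e", "l", "o"],
]
vocabulary = build_vocabulary(training_utterances, max_n=2)
features = ngram_feature_vector(["h", "e", "l", "o"], vocabulary, max_n=2)
```

Each utterance then becomes a fixed-length vector regardless of its duration, which is what lets a standard SVM be applied to variable-length phoneme strings.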

Keywords :

hidden Markov models; learning (artificial intelligence); natural language processing; signal classification; speaker recognition; speech coding; statistical analysis; support vector machines; English telephone-based speech corpora; LID system; Northern Sotho telephone-based speech corpora; WER; acoustic models; automatic language identification; automatic speech recognition system; back-end SVM classifier; code-switched speech; code-switched speech corpus; decoded-phonemes; feature vector sequences; hidden Markov models; human language technology; integrated language identification; mixed-language phenomenon; mixed-language speech context; multilingual speakers; n-gram structures; spoken utterances; statistical occurrences; supervised support vector machine; word error rate; Dictionaries; Feature extraction; Hidden Markov models; Speech; Speech coding; Speech recognition; Support vector machine classification; code-switching; decoded phonemes; language identification; n-gram models; speech recognition; support vector machine;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Speech Technology and Human-Computer Dialogue (SpeD), 2013 7th Conference on

Conference_Location :

Cluj-Napoca

Type :

conf

DOI :

10.1109/SpeD.2013.6682661

Filename :

6682661

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=653734