DocumentCode
703255
Title
Identification of spoken European languages
Author
Caseiro, Diamantino ; Trancoso, Isabel
Author_Institution
INESC/IST, INESC, Lisbon, Portugal
fYear
1998
fDate
8-11 Sept. 1998
Firstpage
1
Lastpage
4
Abstract
Automatic spoken language identification is the problem of identifying the language being spoken from a sample of speech by an unknown speaker. In this paper we studied the problem of language identification in the context of the European languages, which allowed us to study the effect of language proximity in Indo-European languages. The results reveal a significant impact on the identification of some languages. Current language identification systems vary in their complexity. The systems that use higher-level information have the best performance. Nevertheless, that information is hard to collect for each new language. The system presented in this work is easily extendable to new languages because it uses very little linguistic information. In fact, the presented system needs only one language-specific phone recogniser (in our case the Portuguese one), and is trained with speech from each of the other languages. On the SpeechDat-M corpus, with 6 European languages (English, French, German, Italian, Portuguese and Spanish), our system achieved an identification rate of about 79% on 5-second utterances.
Keywords
natural language processing; speaker recognition; English language; French language; German language; Indo-European languages; Italian language; Portuguese languages; Spanish language; SpeechDat-M corpus; automatic spoken European language identification system; language proximity; language specific phone recogniser; linguistic information; speaker identification; Computational modeling; Computer architecture; Europe; Hidden Markov models; Pragmatics; Speech; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO 1998), 9th European
Conference_Location
Rhodes
Print_ISBN
978-960-7620-06-4
Type
conf
Filename
7089726
Link To Document