Language identification using parallel syllable-like unit recognition

Author

Nagarajan, T. ; Murthy, Hema A.

Author_Institution

Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Madras, India

Volume

1

fYear

2004

fDate

17-21 May 2004

Abstract

Automatic spoken language identification (LID) is the task of identifying the language of a short speech utterance. The most successful approach to LID runs phone recognizers for several languages in parallel. The basic requirement for building such a parallel phone recognition (PPR) system is annotated corpora. A novel approach to the LID task is proposed which uses parallel syllable-like unit recognizers in a framework similar to the PPR approach in the literature. The difference is that the syllable models are built from the training data without supervision: the data is first segmented into syllable-like units, and the syllable segments are then clustered using an incremental approach, which results in a set of syllable models for each language. Our initial results on the OGI MLTS corpora show that the performance is 69.5%. We further show that if only a subset of syllable models that are unique (in some sense) is considered, the performance improves to 75.9%.
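
The parallel-recognizer framework described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model type or the scoring rule, so the sketch assumes that each syllable-like unit model is a toy 1-D Gaussian over a scalar feature, that each segment is scored by its best-matching unit, and that the language with the highest total log-likelihood is chosen. The names `log_gauss`, `language_score`, `identify_language`, `lang_a`, and `lang_b` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# ASSUMPTION: a unit model is a toy 1-D Gaussian (mean, std). A real system
# would use trained acoustic models over feature-vector sequences.
UnitModel = Tuple[float, float]


def log_gauss(x: float, mean: float, std: float) -> float:
    """Log-density of a 1-D Gaussian at x."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def language_score(segments: List[float], units: List[UnitModel]) -> float:
    """Score an utterance against one language's unit models.

    ASSUMPTION: each segment is scored by its best-matching unit, and the
    per-segment log-likelihoods are summed over the utterance.
    """
    return sum(max(log_gauss(x, m, s) for m, s in units) for x in segments)


def identify_language(
    segments: List[float], models: Dict[str, List[UnitModel]]
) -> Tuple[str, Dict[str, float]]:
    """Run every language's recognizer and return the best-scoring language."""
    scores = {lang: language_score(segments, units) for lang, units in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical unit inventories for two made-up languages.
    models = {
        "lang_a": [(0.0, 1.0), (2.0, 1.0)],
        "lang_b": [(10.0, 1.0), (12.0, 1.0)],
    }
    best, scores = identify_language([0.1, 1.9, 0.3], models)
    print(best, scores)
```

In this sketch, restricting a language's list of unit models to a subset corresponds to the abstract's idea of considering only some of the syllable models; how that subset is selected is not stated in the abstract and is not modeled here.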

Keywords

learning (artificial intelligence); natural languages; parallel processing; speech recognition; LID; annotated corpora; automatic language identification; parallel phone recognition system; parallel syllable-like unit recognition; speech signal; spoken language identification; training data; unsupervised syllable models; Automatic speech recognition; Computer science; Frequency; Humans; Information resources; Natural languages; Performance analysis; Signal processing; Testing; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1326007

Filename

1326007