Title :
Statistical models for automatic language identification
Author :
Li, K.P. ; Edwards, T.J.
Author_Institution :
TRW Defense and Space Systems Group, Redondo Beach, CA
Abstract :
An Automatic Language Identification system simulation has been developed based upon an automatic acoustic-phonetic segmentation of speech. Utilizing six acoustic-phonetic segmentation classes, various finite-state models were developed to distinguish among five different languages. The finite-state models (trained with gathered segmentation language statistics) considered concatenations of individual segments as well as syllable-like strings. No attempt was made to locate syllable boundaries; therefore, the syllable models described either inter-syllable nuclei or intra-syllable nucleus segment statistics. Segmental durations were also included in some models. Language identification results ranged considerably across models, reaching a maximum of 80 percent correct identification for an independent test on 50 talkers (ten talkers per language).
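As a rough illustration of the general idea in the abstract (per-language finite-state models trained on segment-class statistics, with the best-scoring model chosen at test time), the following minimal sketch uses a first-order Markov chain over six segment classes. It is not the authors' system: the integer class labels 0-5, the add-one smoothing, the function names, and the two-language toy data are all assumptions made for illustration, and the syllable-like and duration models mentioned in the abstract are not reproduced.

```python
import math

# Six segmentation classes, per the abstract. The integer labels 0-5 are
# placeholders; the abstract does not name the classes.
NUM_CLASSES = 6


def train_bigram_model(sequences, num_classes=NUM_CLASSES, alpha=1.0):
    """Estimate smoothed transition log-probabilities from label sequences.

    alpha is an additive-smoothing constant (an assumption, not from the paper).
    """
    counts = [[alpha] * num_classes for _ in range(num_classes)]
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1.0
    log_probs = []
    for row in counts:
        total = sum(row)
        log_probs.append([math.log(c / total) for c in row])
    return log_probs


def log_likelihood(model, seq):
    """Sum the transition log-probabilities along one segment sequence."""
    return sum(model[prev][cur] for prev, cur in zip(seq, seq[1:]))


def identify(models, seq):
    """Return the language whose model scores the sequence highest."""
    return max(models, key=lambda lang: log_likelihood(models[lang], seq))


# Synthetic toy sequences, invented for this sketch (not data from the paper).
training = {
    "lang_a": [[0, 1, 0, 1, 0, 1, 2], [0, 1, 0, 1, 0, 1]],
    "lang_b": [[3, 4, 5, 3, 4, 5], [3, 4, 5, 3, 4, 5, 3]],
}
models = {lang: train_bigram_model(seqs) for lang, seqs in training.items()}
guess = identify(models, [0, 1, 0, 1, 0])
```

With one model per language, the same `identify` call would extend to the five-language setting described in the abstract; only the `training` dictionary would change.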
Keywords :
Acoustic testing; Character recognition; Data analysis; Data mining; Natural languages; Probability; Speech processing; Statistics; Training data; Voting
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '80.
DOI :
10.1109/ICASSP.1980.1170832