Multi-stream language identification using data-driven dependency selection

Author

Parandekar, Sonia ; Kirchhoff, Katrin

Author_Institution

Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA

Volume

1

fYear

2003

fDate

6-10 April 2003

Abstract

In the past, the most widespread approach to automatic language identification (LID) has been the statistical modeling of phone sequences extracted from speech signals. Recently, we developed an alternative approach to LID based on n-gram modeling of parallel streams of articulatory features. This approach was shown to have advantages over phone-based systems on short test signals, whereas phone-based systems achieved higher accuracy on longer signals. Additionally, phone and feature streams can be combined to achieve maximum performance. Within this "multi-stream" framework, two types of statistical dependencies need to be modeled: (a) dependencies between symbols within individual streams and (b) dependencies between symbols in different streams. The space of possible dependencies is typically too large to be searched exhaustively. We explore the use of genetic algorithms as a method for data-driven dependency selection. The result is a general framework for the discovery and modeling of dependencies between multiple information sources expressed as sequences of symbols, which has implications for fields beyond language identification, such as speaker identification or language modeling.

Keywords

genetic algorithms; grammars; natural languages; speech recognition; statistical analysis; LID; articulatory features; automatic language identification; data-driven dependency selection; feature streams; language modeling; multi-stream language identification; multiple information sources; n-gram modeling; parallel streams; phone sequences; phone streams; phone-based systems; short test signals; speaker identification; speech signals; statistical dependencies; statistical modeling; symbol sequences; Context modeling; Hidden Markov models; Loudspeakers; Probability distribution; Signal processing; Speech; System testing; Tongue

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-7663-3

Type

conf

DOI

10.1109/ICASSP.2003.1198708

Filename

1198708