Discriminative Classifiers for Language Recognition

Author

White, Christopher ; Shafran, Zhak ; Gauvain, Jean-Luc

Author_Institution

Center for Language & Speech Processing, Johns Hopkins University, Baltimore, MD

Volume

1

fYear

2006

fDate

14-19 May 2006

Abstract

Most language recognition systems consist of a cascade of three stages: (1) tokenizers that produce parallel phone streams, (2) phonotactic models that score the match between each phone stream and the phonotactic constraints of the target language, and (3) a final stage that combines the scores from the parallel streams appropriately (M.A. Zissman, 1996). This paper reports a series of contrastive experiments to assess the impact of replacing the second and third stages with large-margin discriminative classifiers. In addition, it investigates how sounds that are not represented in the tokenizers of the first stage can be approximated with composite units that exploit cross-stream dependencies obtained via multi-string alignments. This leads to a discriminative framework that can potentially incorporate a richer set of features, such as prosodic and lexical cues. Experiments are reported on the NIST LRE 1996 and 2003 tasks, and the results show that the new techniques give substantial gains over a competitive PPRLM baseline.

Keywords

natural languages; speech recognition; support vector machines; cross-stream dependencies; language recognition; large-margin discriminative classifiers; multistring alignments; Lattices; NIST; Neural networks; Performance evaluation; Speaker recognition; Speech processing; Target recognition; Testing

fLanguage

English

Publisher

IEEE

Conference_Titel

2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Proceedings

Conference_Location

Toulouse

ISSN

1520-6149

Print_ISBN

1-4244-0469-X

Type

conf

DOI

10.1109/ICASSP.2006.1659995

Filename

1659995