DocumentCode
302179
Title
Statistical models for topic identification using phoneme substrings
Author
Wright, Jerry H. ; Carey, Michael J. ; Parris, Eluned S.
Author_Institution
Ensigma Ltd., Chepstow, UK
Volume
1
fYear
1996
fDate
7-10 May 1996
Firstpage
307
Abstract
Phoneme substrings that recur within training data are detected and logged using dynamic programming procedures. The resulting keystrings (cluster centroids) are awarded a usefulness rating based on smoothed occurrence probabilities in wanted and unwanted data. The rankings of the keystrings by usefulness, measured on training, development test and final test data for three language pairs from the OGI multi-language corpus, are highly consistent, showing that language-specific features are being found. Statistical measures of local association also suggest that keystring occurrences can be correlated in a manner similar to that of keywords for a particular topic. With improved recognition accuracy it should be possible to exploit this information in order to enhance performance in topic identification.
Keywords
correlation methods; dynamic programming; probability; smoothing methods; speech processing; speech recognition; statistical analysis; OGI multilanguage corpus; cluster centroids; correlation; development test; keystrings; language-pairs; local association; performance; phoneme substrings; recognition accuracy; smoothed occurrence probabilities; statistical measures; statistical models; test data; topic identification; training data; usefulness rating; Cepstral analysis; Filter bank; Hidden Markov models; Mathematics; Parameter estimation; Testing; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location
Atlanta, GA
ISSN
1520-6149
Print_ISBN
0-7803-3192-3
Type
conf
DOI
10.1109/ICASSP.1996.540419
Filename
540419