• DocumentCode
    302179
  • Title

    Statistical models for topic identification using phoneme substrings

  • Author

    Wright, Jerry H. ; Carey, Michael J. ; Parris, Eluned S.

  • Author_Institution
    Ensigma Ltd., Chepstow, UK
  • Volume
    1
  • fYear
    1996
  • fDate
    7-10 May 1996
  • Firstpage
    307
  • Abstract
    Phoneme substrings that are recurrent within training data are detected and logged using dynamic programming procedures. The resulting keystrings (cluster centroids) are awarded a usefulness rating based on smoothed occurrence probabilities in wanted and unwanted data. The rankings of the keystrings by usefulness measured on training, development test and final test data for three language-pairs from the OGI multi-language corpus are highly consistent, showing that language-specific features are being found. Statistical measures of local association also suggest that keystring occurrences can be correlated in a manner similar to that of keywords for a particular topic. With improved recognition accuracy it should be possible to exploit this information in order to enhance performance in topic identification.
  • Keywords
    correlation methods; dynamic programming; probability; smoothing methods; speech processing; speech recognition; statistical analysis; OGI multilanguage corpus; cluster centroids; correlation; development test; dynamic programming; keystrings; language-pairs; local association; performance; phoneme substrings; recognition accuracy; smoothed occurrence probabilities; statistical measures; statistical models; test data; topic identification; training data; usefulness rating; Cepstral analysis; Dynamic programming; Filter bank; Hidden Markov models; Mathematics; Parameter estimation; Speech recognition; Testing; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-3192-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1996.540419
  • Filename
    540419