• DocumentCode
    311039
  • Title
    Approaches to phoneme-based topic spotting: an experimental comparison
  • Author
    Kuhn, Roland ; Nowell, Peter ; Drouin, Caroline
  • Author_Institution
    Speech Technol. Lab., Panasonic Technol. Inc., Santa Barbara, CA, USA
  • Volume
    3
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    1819
  • Abstract
    Topic spotting is often performed on the output of a large vocabulary recognizer or a keyword spotter. However, this requires detailed knowledge of the vocabulary as well as transcribed training data. If portability to new topics and languages is important, then a topic spotter based on phoneme recognition is preferable. A phoneme recognizer is run on training data consisting of audio files labeled by topic alone; no word transcripts are required. Phoneme sub-sequences which help to predict the topic are then extracted automatically. The work described was carried out by two teams exploring three very different approaches to phoneme-based topic spotting: the “DP-ngram”, the “decision tree”, and the “Euclidean” approach. Results obtained by each team on the ARM (Airborne Reconnaissance Mission) and Switchboard data sets were compared by means of receiver operating characteristic (ROC) curves. The best performance for each team was obtained via a similar type of discriminative training.
  • Keywords
    decision theory; dynamic programming; grammars; speech processing; speech recognition; trees (mathematics); Airborne Reconnaissance Mission data set; DP-ngram; Euclidean approach; Switchboard data set; audio files; decision tree; experimental comparison; keyword spotter; language portability; large vocabulary recognizer; performance; phoneme based topic spotting; phoneme recognition; phoneme recognizer; phoneme subsequences; receiver operating characteristic curves; topic portability; training data; transcribed training data; Clustering algorithms; Data mining; Dynamic programming; Educational institutions; Frequency; Heuristic algorithms; Laboratories; Reconnaissance; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.598890
  • Filename
    598890