• DocumentCode
    417152
  • Title
    Vocabulary-independent search in spontaneous speech
  • Author
    Seide, Frank ; Yu, Peng ; Ma, Chengyuan ; Chang, Eric
  • Author_Institution
    Microsoft Research Asia, Beijing, China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    For efficient organization of speech recordings (meetings, interviews, voice mails, lectures), the ability to search for spoken keywords is essential. Today, most spoken-document retrieval systems use large-vocabulary recognition. For the above scenarios, such systems suffer from both the unpredictable vocabulary/domain and generally high word-error rates (WER). We present a vocabulary-independent system to index and rapidly search spontaneous speech. A speech recognizer generates lattices of phonetic word fragments, against which keywords are matched phonetically. We first show, on a word-based baseline, the need to use recognition alternatives (lattices) in a high-WER context. Then we introduce our new method of phonetic word-fragment lattice generation, which uses longer-span language knowledge than a phoneme recognizer. Last, we introduce heuristics to compact the lattices to feasible sizes that can be searched efficiently. On the LDC voice mail corpus, we show that vocabulary/domain-independent phonetic search is as accurate as a vocabulary/domain-dependent word-lattice-based baseline system for in-vocabulary keywords (FOMs of 74-75%), and nearly maintains this accuracy for out-of-vocabulary keywords as well.
  • Keywords
    error statistics; query processing; speech processing; speech recognition; large-vocabulary recognition; phonetic word fragment lattice generation; speech recognizer; speech recordings; spoken keywords; spoken-document retrieval systems; spontaneous speech; vocabulary-independent search; word-error rates; Asia; Audio recording; Broadcasting; Humans; Indexing; Information retrieval; Lattices; Speech recognition; Vocabulary; Voice mail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1325970
  • Filename
    1325970