Title :
Open-vocabulary keyword detection from super-large scale speech database
Author :
Kanda, Naoyuki ; Sagawa, Hirohiko ; Sumiyoshi, Takashi ; Obuchi, Yasunari
Author_Institution :
Central Res. Lab., Hitachi Ltd., Kokubunji
Abstract :
This paper presents our recent attempt to build a super-large scale spoken-term detection system that can detect any keyword uttered in a 2,000-hour speech database within a few seconds. Building such a system poses three problems. First, the system must be able to detect out-of-vocabulary (OOV) terms (the OOV problem). Second, the system has to respond to the user quickly without sacrificing search accuracy (the search speed and accuracy problem). Third, the pre-stored index database should be sufficiently small (the index size problem). We introduced a phoneme-based search method to detect OOV terms and combined it with a method based on large-vocabulary continuous speech recognition (LVCSR). To search for a keyword in large-scale speech databases accurately and quickly, we introduced a multistage rescoring strategy that uses several search methods to reduce the search space in a stepwise fashion. Furthermore, we constructed an out-of-vocabulary/in-vocabulary region classifier, which allows us to reduce the size of the index database for OOVs. We describe the prototype system and present some evaluation results.
Keywords :
object detection; search problems; speech processing; open-vocabulary keyword detection; out-of-vocabulary detection; phoneme-based search method; pre-stored index database; super-large scale speech database; super-large scale spoken-term detection system; Data mining; Databases; Humans; Indexes; Laboratories; Large-scale systems; Prototypes; Search methods; Speech recognition; TV
Conference_Title :
Multimedia Signal Processing, 2008 IEEE 10th Workshop on
Conference_Location :
Cairns, Qld
Print_ISBN :
978-1-4244-2294-4
Electronic_ISBN :
978-1-4244-2295-1
DOI :
10.1109/MMSP.2008.4665209