DocumentCode
2330001
Title
An efficient approach for two-stage open vocabulary spoken term detection
Author
Norouzian, Atta ; Rose, Richard
Author_Institution
Dept. of ECE, McGill Univ., Montreal, QC, Canada
fYear
2010
fDate
12-15 Dec. 2010
Firstpage
194
Lastpage
199
Abstract
This paper investigates indexing strategies for open vocabulary spoken term detection (STD) in a lecture speech domain. STD is performed from word lattices generated offline using an automatic speech recognition (ASR) system configured from a meetings task domain. Indexing of lattice paths is performed to avoid exhaustive search of audio segments, which can be impractical for extremely large media repositories. The method is based on constructing a word-based index from these lattices and using an approximate subword-based algorithm for accessing index entries from subword expansions of query terms. Results are presented for an experimental study demonstrating both STD performance and the potential for scaling the indexing strategy to very large collections of audio segments.
Keywords
indexing; speech processing; speech recognition; word processing; approximate subword based algorithm; audio segment; automatic speech recognition system; lattice path indexing; lecture speech domain; open vocabulary spoken term detection; two stage open vocabulary spoken term detection; word based index; Speech recognition; spoken term detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-7904-7
Electronic_ISBN
978-1-4244-7902-3
Type
conf
DOI
10.1109/SLT.2010.5700850
Filename
5700850