DocumentCode :
3531333
Title :
Efficient subword lattice retrieval for German spoken term detection
Author :
Mertens, Timo ; Schneider, Daniel
Author_Institution :
Dept. of Electron. & Telecommun., NTNU, Trondheim
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4885
Lastpage :
4888
Abstract :
We present a lattice-based spoken term detection (STD) method for German broadcast news data and compare it to a previously proposed fuzzy search. Because out-of-vocabulary (OOV) words are a significant problem in German, we evaluate suitable subword indexing units for lattice retrieval. Hybrid lattice retrieval of words and subwords is investigated because of the robust nature of words as an indexing unit. We show that by using efficient lattice graph and score pruning techniques, precision of subword retrieval is increased by 8% absolute with only a small loss in recall. Additionally, a speed-up of up to 6 times can be observed.
Keywords :
fuzzy set theory; graph theory; indexing; information retrieval; natural language processing; speech processing; vocabulary; German spoken term detection; fuzzy search; lattice graph; lattice-based STD method; out-of-vocabulary; score pruning techniques; subword indexing; subword lattice retrieval; Broadcasting; Error analysis; Indexing; Lattices; Morphology; Natural languages; Robustness; Speech recognition; Testing; Vocabulary; speech recognition; speech search; spoken document retrieval; spoken term detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960726
Filename :
4960726