DocumentCode :
3630614
Title :
Sub-word modeling of out of vocabulary words in spoken term detection
Author :
Igor Szoke;Lukas Burget;Jan Cernocky;Michal Fapso
Author_Institution :
Speech@FIT, Faculty of Information Technology, Brno University of Technology, Czech Republic
fYear :
2008
Firstpage :
273
Lastpage :
276
Abstract :
This paper compares sub-word based methods for the spoken term detection (STD) task and for phone recognition. Sub-word units are needed to search for out-of-vocabulary words. We compared words, phones and multigrams. The maximal length and pruning of multigrams were investigated first. Then two constrained methods of multigram training were proposed. We evaluated on the NIST STD06 dev-set CTS data. The conclusion is that the proposed method improves phone accuracy by more than 9% relative and STD accuracy by more than 7% relative.
Keywords :
"Vocabulary","Lattices","Indexing","Information technology","NIST","Speech processing","Broadcasting","Telephony","Speech recognition","Dictionaries"
Publisher :
IEEE
Conference_Titel :
Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
Print_ISBN :
978-1-4244-3471-8
Type :
conf
DOI :
10.1109/SLT.2008.4777893
Filename :
4777893