DocumentCode
2018241
Title
Data-driven lexicon refinement using local and web resources for Chinese speech recognition
Author
Zhang, Hua ; Zhu, Xuan ; Su, Teng-Rong ; Eom, Ki-Wan ; Lee, Jae-Won
Author_Institution
China Samsung Telecom R&D Center, Samsung Electronics, Beijing, China
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
233
Lastpage
237
Abstract
This paper proposes a data-driven lexicon refinement method. By expanding and polishing the lexicon using local and web resources, the method effectively improves the accuracy of a Chinese automatic speech recognition (ASR) system. The proposed refinement process consists of two steps. First, an improved intra-word measure is introduced to expand the lexicon from local text corpora. Second, the expanded lexicon is polished by measuring the popularity of the appended words, based on web query results obtained from a search engine. The evaluation experiments are carried out on a voice-enabled tourist information query system. Experimental results show that the proposed lexicon refinement method yields a relative reduction in character error rate (CER) of 7.9%.
Keywords
Internet; search engines; speech recognition; ASR; CER; Chinese speech recognition; automatic speech recognition; character error rate; data driven lexicon refinement; expanding lexicon; intra word measurement; lexicon refining process; local resources; polishing lexicon; search engine; voice enabled tourist information query system; web query; web resources; bi-gram measure; lexicon refinement
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684905
Filename
5684905