Title :
On-demand new word learning using world wide web
Author :
Oger, Stanislas ; Linarès, Georges ; Béchet, Frédéric ; Nocera, Pascal
Date :
March 31 2008-April 4 2008
Abstract :
Most Web-based methods for lexicon augmentation capture global semantic features of the targeted domain in order to collect relevant documents from the Web. We suggest that the local context of out-of-vocabulary (OOV) words contains relevant information about those words. Using this information, we propose to use the Web to build locally augmented lexicons, which are then used in a final local decoding pass. Our experiments confirm the relevance of the Web for OOV word retrieval. Different methods are proposed to retrieve the hypothesis words. Finally, we present the integration of new words into the transcription process based on part-of-speech models. This technique recovers 7.6% of the significant OOV words and improves the accuracy of the system.
Keywords :
information retrieval; semantic Web; word processing; Web-based methods; World Wide Web; global semantic features; out-of-vocabulary word retrieval; transcription process; word learning; Automatic speech recognition; Broadcasting; Decoding; Dictionaries; Hidden Markov models; Natural languages; Speech recognition; Vocabulary; Web sites; Lexical modeling
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518607