DocumentCode :
179360
Title :
A bin-based ontological framework for low-resource n-gram smoothing in language modelling
Author :
Benahmed, Y. ; Selouani, Sid-Ahmed ; O'Shaughnessy, D.
Author_Institution :
INRS-EMT, Montréal, QC, Canada
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
4918
Lastpage :
4922
Abstract :
In this paper, we introduce a novel method of smoothing language models (LM) based on the semantic information found in ontologies, a method especially suited to limited-resource language modeling. We exploit the latent knowledge of language that is encoded within ontologies. Specifically, this work examines the potential of using the semantic and syntactic relations between words from the WordNet ontology to generate new plausible contexts for unseen events, simulating a larger corpus. These unseen events are then mixed with a baseline Witten-Bell (WB) LM in order to improve its performance in terms of both language model perplexity and automatic speech recognition word error rate. Results indicate a significant reduction in the perplexity of the language model (up to 9.85% relative), together with a statistically significant reduction in word error rate compared to both the original WB LM and a baseline Kneser-Ney smoothed language model on the Wall Street Journal-based Continuous Speech Recognition Phase II corpus.
Keywords :
natural language processing; ontologies (artificial intelligence); speech recognition; LM smoothing; Wall Street Journal; WordNet ontology; automatic speech recognition word error rates; baseline WB LM; baseline Witten-Bell LM; bin-based ontological framework; continuous speech recognition phase II corpus; in ontologies; language model perplexity; language model smoothing; limited-resources language modeling; low-resource N-gram smoothing; plausible context generation; semantic relation; syntactic relation; unseen events; word error rate reduction; Automatic speech recognition; Computational modeling; Ontologies; Optimization; Smoothing methods; Speech; Language modeling; context modeling; low-resource speech recognition; ontologies;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854537
Filename :
6854537