Title :
A grammatical approach to reducing the statistical sparsity of language models in natural domains
Author :
English, Thomas M. ; Boggess, Lois C.
Author_Institution :
Mississippi State University, Mississippi State, MS, USA
Abstract :
Network models of natural language grow large and sparse while failing to predict many subsequent inputs. A syntax-directed speech recognizer cannot correctly transcribe a sentence for which no network path exists. The sparsity and size of a network may be reduced by partitioning the vocabulary into primary and secondary vocabularies on the basis of word frequency. Sentences with secondary phrases replaced by a placeholder are used to build a network. Secondary phrases grouped according to which primary words immediately precede and follow them are used to build lower-level networks. The groups of phrases constitute crude grammatical categories. Preliminary study suggests the efficacy of the approach.
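The partitioning steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract says only that the vocabulary is split "on the basis of word frequency", so the fixed count threshold `min_count` is an assumption, as is treating each maximal run of secondary words as one "secondary phrase". The placeholder token `<SEC>`, the use of `None` to mark a sentence boundary, and all function names are hypothetical. The sketch stops at producing the placeholder sentences and the phrase groups; it does not build the networks themselves.

```python
from collections import Counter, defaultdict

PLACEHOLDER = "<SEC>"  # hypothetical placeholder token for a secondary phrase


def partition_vocabulary(sentences, min_count):
    """Return the primary vocabulary: words seen at least min_count times.

    The threshold rule is an assumption; the abstract only says the split
    is made on the basis of word frequency.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, count in counts.items() if count >= min_count}


def split_sentence(sentence, primary):
    """Replace each maximal run of secondary words with PLACEHOLDER.

    Returns (skeleton, phrases), where phrases is a list of
    (preceding_primary, phrase, following_primary) triples and None
    stands for a sentence boundary (an assumption of this sketch).
    """
    skeleton, phrases, run = [], [], []
    prev = None
    for word in sentence:
        if word in primary:
            if run:  # a secondary phrase just ended
                phrases.append((prev, tuple(run), word))
                skeleton.append(PLACEHOLDER)
                run = []
            skeleton.append(word)
            prev = word
        else:
            run.append(word)
    if run:  # secondary phrase at the end of the sentence
        phrases.append((prev, tuple(run), None))
        skeleton.append(PLACEHOLDER)
    return skeleton, phrases


def build_groups(sentences, min_count):
    """Produce placeholder sentences and phrase groups keyed by context."""
    primary = partition_vocabulary(sentences, min_count)
    skeletons = []
    groups = defaultdict(set)
    for sentence in sentences:
        skeleton, phrases = split_sentence(sentence, primary)
        skeletons.append(skeleton)
        for prev, phrase, nxt in phrases:
            groups[(prev, nxt)].add(phrase)
    return primary, skeletons, dict(groups)


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat".split(),
        "the dog sat on the rug".split(),
        "the old dog sat on the mat".split(),
    ]
    primary, skeletons, groups = build_groups(corpus, min_count=3)
    print(sorted(primary))
    print(skeletons[0])
    for context, phrases in groups.items():
        print(context, sorted(phrases))
```

In this toy corpus all three sentences collapse to the same placeholder sentence, which is the sense in which the upper-level network becomes smaller and less sparse, while the phrases sharing a context (for example, those between "the" and "sat") form one of the crude grammatical categories the abstract mentions.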
Keywords :
Computer science; Error correction; Intelligent networks; Natural languages; Predictive models; Speech recognition; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
DOI :
10.1109/ICASSP.1986.1168955