Title : 
A grammatical approach to reducing the statistical sparsity of language models in natural domains
         
        
            Author : 
English, Thomas M. ; Boggess, Lois C.
         
        
            Author_Institution : 
Mississippi State University, Mississippi State, MS, USA
         
        
            Abstract : 
Network models of natural language grow large and sparse while failing to predict many subsequent inputs. A syntax-directed speech recognizer cannot correctly transcribe a sentence for which no network path exists. The sparsity and size of a network may be reduced by partitioning the vocabulary into primary and secondary vocabularies on the basis of word frequency. Sentences with secondary phrases replaced by a placeholder are used to build a network. Secondary phrases grouped according to which primary words immediately precede and follow them are used to build lower-level networks. The groups of phrases constitute crude grammatical categories. Preliminary study suggests the efficacy of the approach.
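
The partitioning and grouping steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `<SEC>` placeholder token, the fixed-size frequency cutoff, and the use of `None` for a missing neighbor at a sentence boundary are all assumptions made for illustration. Sentences are assumed to be pre-tokenized lists of words.

```python
from collections import Counter, defaultdict

PLACEHOLDER = "<SEC>"  # hypothetical placeholder token, not from the paper


def split_vocabulary(sentences, primary_size):
    """Return the primary vocabulary: the `primary_size` most frequent words.

    A fixed-size cutoff is an assumption; the abstract only says the split
    is made on the basis of word frequency.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, _ in counts.most_common(primary_size)}


def reduce_sentence(sentence, primary):
    """Replace each maximal run of secondary words with PLACEHOLDER.

    Returns the reduced sentence and a list of
    (preceding primary word, secondary phrase, following primary word).
    A neighbor is None when the phrase touches a sentence boundary.
    """
    reduced, contexts = [], []
    i, n = 0, len(sentence)
    while i < n:
        if sentence[i] in primary:
            reduced.append(sentence[i])
            i += 1
            continue
        # Scan to the end of the current run of secondary words.
        j = i
        while j < n and sentence[j] not in primary:
            j += 1
        before = sentence[i - 1] if i > 0 else None
        after = sentence[j] if j < n else None
        reduced.append(PLACEHOLDER)
        contexts.append((before, tuple(sentence[i:j]), after))
        i = j
    return reduced, contexts


def group_phrases(sentences, primary):
    """Build the reduced sentences and the context-keyed phrase groups.

    The reduced sentences would feed the top-level network; each group,
    keyed by (preceding, following) primary word, would feed a lower-level
    network and acts as a crude grammatical category.
    """
    groups = defaultdict(set)
    reduced_all = []
    for sentence in sentences:
        reduced, contexts = reduce_sentence(sentence, primary)
        reduced_all.append(reduced)
        for before, phrase, after in contexts:
            groups[(before, after)].add(phrase)
    return reduced_all, dict(groups)
```

For example, with primary words `the` and `ran`, the sentences "the red fox ran" and "the old gray dog ran" both reduce to `the <SEC> ran`, and the phrases "red fox" and "old gray dog" land in the same group because they share the same preceding and following primary words.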
         
        
            Keywords : 
Computer science; Error correction; Intelligent networks; Natural languages; Predictive models; Speech recognition; Vocabulary;
         
        
        
        
            Conference_Title : 
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
         
        
        
            DOI : 
10.1109/ICASSP.1986.1168955