Title :
A word prediction methodology for automatic sentence completion
Author :
Spiccia, Carmelo ; Augello, Agnese ; Pilato, Giovanni ; Vassallo, Giorgio
Author_Institution :
Ist. di Calcolo e Reti ad Alte Prestazioni (ICAR), Palermo, Italy
Abstract :
Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non-neural-network language models.
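As a rough illustration of the idea, the following minimal sketch builds an asymmetric word-word frequency table and ranks candidates for a missing word. It is not the authors' method: the abstract does not define the matrix or the scoring function, so the right-hand context window, the names `build_cooccurrence` and `score_candidate`, and the additive score are all assumptions made for this example. The Latent Semantic Analysis step (a truncated SVD of the matrix) is omitted to keep the sketch dependency-free.

```python
from collections import defaultdict


def build_cooccurrence(sentences, window=2):
    """Count how often word c appears within `window` positions AFTER word w.

    Assumption for this sketch: only the right-hand context is counted, so
    counts[w][c] generally differs from counts[c][w] (the table is asymmetric).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, w in enumerate(tokens):
            for c in tokens[i + 1 : i + 1 + window]:
                counts[w][c] += 1
    return counts


def score_candidate(counts, left_context, right_context, candidate):
    """Hypothetical score: sum of raw co-occurrence counts with the context.

    Words to the left are looked up as predecessors of the candidate;
    words to the right are looked up as its successors.
    """
    score = 0
    for w in left_context:
        score += counts.get(w, {}).get(candidate, 0)
    for w in right_context:
        score += counts.get(candidate, {}).get(w, 0)
    return score


# Toy training data, invented for this example.
sentences = ["the cat sat on the mat", "the dog sat on the rug"]
counts = build_cooccurrence(sentences, window=2)

# Rank candidates for the gap in "the cat ___ on the mat".
left, right = ["the", "cat"], ["on", "the"]
ranked = sorted(
    ["sat", "rug", "dog"],
    key=lambda cand: score_candidate(counts, left, right, cand),
    reverse=True,
)
print(ranked)
```

A word-word table grows with the vocabulary size rather than with the number of training documents, which is one plausible reading of the scalability claim in the abstract.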
Keywords :
computational linguistics; matrix algebra; natural language processing; statistical analysis; text analysis; asymmetric word-word frequency matrix; automatic sentence completion; latent semantic analysis; n-grams occurrence statistics; word prediction methodology; Accuracy; Semantics;
Conference_Title :
Semantic Computing (ICSC), 2015 IEEE International Conference on
Conference_Location :
Anaheim, CA
DOI :
10.1109/ICOSC.2015.7050813