Title :
With LSA Size DOES Matter
Author_Institution :
Fac. of ICT, Dept. of CIS, Univ. of Malta, Msida, Malta
Abstract :
Latent Semantic Analysis (LSA) is a technique from the field of Natural Language Processing (NLP) that enables the comparison of semantic similarity between documents using vector operations. It has been applied in areas ranging from Information Retrieval (IR) to the automated assessment of essays. One property considered in document comparison is document size. The general philosophy is that more text is better, although few concrete examples or guidelines exist to demonstrate this. This paper shows, via a novel concrete example taken from real-world data, that larger documents do lead to more accurate semantic similarity comparisons.
Keywords :
document handling; information retrieval; natural language processing; IR; LSA size; document comparison; latent semantic analysis; semantic similarities; vector operations; Correlation; Educational institutions; Learning systems; Matrix decomposition; Semantics; Vectors; LSA; NLP; automated essay assessment; document length
Conference_Title :
Computer Modeling and Simulation (EMS), 2012 Sixth UKSim/AMSS European Symposium on
Conference_Location :
Valletta
Print_ISBN :
978-1-4673-4977-2
DOI :
10.1109/EMS.2012.24