DocumentCode
2052182
Title
Generalizing Latent Semantic Analysis
Author
Olney, Andrew M.
Author_Institution
Inst. for Intell. Syst., Univ. of Memphis, Memphis, TN, USA
fYear
2009
fDate
14-16 Sept. 2009
Firstpage
40
Lastpage
46
Abstract
Latent semantic analysis (LSA) is a vector space technique for representing word meaning. Traditionally, LSA consists of two steps: the formation of a word-by-document matrix, followed by singular value decomposition of that matrix. However, the formation of the matrix along the dimensions of words and documents is somewhat arbitrary. This paper reconceptualizes LSA in more general terms by characterizing the matrix as a feature-by-context matrix rather than a word-by-document matrix. Examples of generalized LSA utilizing n-grams and local context are presented and compared with traditional LSA on paraphrase comparison tasks.
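The two-step pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration with made-up toy documents, not the paper's implementation: rows of the matrix are features (here, words) and columns are contexts (here, documents), matching the traditional word-by-document instantiation of the more general feature-by-context view.

```python
import numpy as np

# Toy corpus; counts are made up purely for illustration.
docs = ["cat sat mat", "cat sat rug", "dog ran park"]
vocab = sorted({w for d in docs for w in d.split()})

# Step 1: form the feature-by-context (word-by-document) count matrix.
X = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Step 2: singular value decomposition, truncated to k dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
# Document vectors in the reduced space are the rows of (S_k @ Vt_k).T.
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Near-paraphrases (docs 0 and 1) end up closer than unrelated docs (0 and 2).
print(cos(doc_vecs[0], doc_vecs[1]) > cos(doc_vecs[0], doc_vecs[2]))
```

Generalizing LSA as the paper proposes amounts to changing what the rows and columns of `X` mean (e.g., n-gram features, local-context columns) while keeping the SVD step unchanged.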
Keywords
matrix algebra; natural language processing; singular value decomposition; text analysis; document matrix; latent semantic analysis; n-grams; vector space technique; Dictionaries; Frequency; Functional analysis; Information retrieval; Intelligent systems; Least squares approximation; Matrix decomposition; Sparse matrices; USA Councils; n-gram; paraphrase; vector space
fLanguage
English
Publisher
ieee
Conference_Titel
2009 IEEE International Conference on Semantic Computing (ICSC '09)
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-4962-0
Electronic_ISBN
978-0-7695-3800-6
Type
conf
DOI
10.1109/ICSC.2009.89
Filename
5298543
Link To Document