DocumentCode
3300594
Title
Improving Latent Semantic Indexing with concepts mapping based on domain ontology
Author
Hao, Jingmin ; Liao, Lejian ; Dong, Xiujie
Author_Institution
Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing
fYear
2008
fDate
19-22 Oct. 2008
Firstpage
1
Lastpage
6
Abstract
The "curse of dimensionality" is a common problem in information retrieval. It has been shown that when points in a vector space are projected onto a random subspace of suitably high dimension, the distances between the points are approximately preserved. Although such a random projection can reduce the dimension of the document space, it does not bring semantically related documents together. Latent Semantic Indexing (LSI) uses singular-value decomposition (SVD) to project documents from the higher-dimensional term space to a lower-dimensional LSI space, both reducing the dimension of the document space and bringing semantically related documents together. However, the computation time of SVD is a bottleneck because of the high dimensionality of the documents. This paper presents a novel dimension-reduction method for improving LSI. The method creates a term-to-concept projection matrix based on a domain ontology and uses it to project documents to a lower-dimensional concept space. LSI pre-computation is then performed not on the original term-by-document matrix but on the lower-dimensional concept-by-document matrix, at great computational savings. Experiments indicate that this method improves the efficiency of LSI without disturbing the similarity judgments between documents.
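The pipeline described in the abstract (map terms to concepts, then run SVD on the smaller matrix) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the toy vocabulary, the `term_to_concept` dictionary standing in for a domain ontology, the 0/1 form of the projection matrix, and the helper names `build_projection`, `lsi_doc_vectors`, and `cosine`. The abstract does not specify how the authors construct or weight their projection matrix.

```python
import numpy as np

# Hypothetical toy vocabulary and ontology mapping (not taken from the paper).
terms = ["car", "automobile", "engine", "apple", "banana", "fruit"]
term_to_concept = {
    "car": "vehicle", "automobile": "vehicle", "engine": "vehicle",
    "apple": "fruit", "banana": "fruit", "fruit": "fruit",
}
concepts = sorted(set(term_to_concept.values()))  # ['fruit', 'vehicle']


def build_projection(terms, concepts, term_to_concept):
    """Return a (concepts x terms) 0/1 matrix sending each term to its concept."""
    P = np.zeros((len(concepts), len(terms)))
    for j, term in enumerate(terms):
        P[concepts.index(term_to_concept[term]), j] = 1.0
    return P


def lsi_doc_vectors(M, k):
    """Rank-k LSI document coordinates from the SVD of M (rows x documents)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (np.diag(s[:k]) @ Vt[:k]).T  # one row per document


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Term-by-document matrix: rows follow `terms`, columns are 4 toy documents.
A = np.array([
    [2, 0, 0, 0],  # car
    [0, 3, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 2, 0],  # apple
    [0, 0, 0, 3],  # banana
    [0, 0, 1, 1],  # fruit
], dtype=float)

P = build_projection(terms, concepts, term_to_concept)
C = P @ A                        # concept-by-document matrix, 2 x 4 instead of 6 x 4
docs = lsi_doc_vectors(C, k=2)   # SVD runs on the smaller matrix

print(C.shape)
print(round(cosine(docs[0], docs[1]), 6))  # two vehicle documents
print(round(cosine(docs[0], docs[2]), 6))  # a vehicle and a fruit document
```

In this toy setup the SVD input shrinks from six rows to two. On a real collection the saving would depend on how many ontology concepts the vocabulary maps to, which the abstract does not report.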
Keywords
information retrieval; ontologies (artificial intelligence); singular value decomposition; concepts mapping; domain ontology; information retrieval; latent semantic indexing; singular-value decomposition; vector space; Computer science; Data mining; Indexing; Information retrieval; Information technology; Laboratories; Large scale integration; Ontologies; Partial response channels; Space technology; LSI; Latent Semantic Indexing; dimension reduction; domain ontology;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-4515-8
Electronic_ISBN
978-1-4244-2780-2
Type
conf
DOI
10.1109/NLPKE.2008.4906768
Filename
4906768