DocumentCode :
3300594
Title :
Improving Latent Semantic Indexing with concepts mapping based on domain ontology
Author :
Hao, Jingmin ; Liao, Lejian ; Dong, Xiujie
Author_Institution :
Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing
fYear :
2008
fDate :
19-22 Oct. 2008
Firstpage :
1
Lastpage :
6
Abstract :
The "curse of dimensionality" is a common problem in information retrieval. It has been shown that when points in a vector space are projected onto a random subspace of suitably high dimension, the distances between the points are approximately preserved. Although such a random projection can be used to reduce the dimension of the document space, it does not bring together semantically related documents. Latent Semantic Indexing (LSI) uses singular-value decomposition (SVD) to project documents from the high-dimensional term space to a lower-dimensional LSI space, both reducing the dimension of the document space and bringing together semantically related documents. However, the computation time of SVD is a bottleneck because of the high dimensionality of the documents. This paper presents a novel dimension-reduction method for improving LSI. The method creates a term-to-concept projection matrix based on a domain ontology and uses it to project documents into a lower-dimensional concept space. LSI pre-computation is then performed not on the original term-by-document matrix but on the lower-dimensional concept-by-document matrix, at great computational savings. Experiments indicate that this method improves the efficiency of LSI without disturbing the similarity judgments between documents.
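The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the four concepts, the binary term-to-concept matrix, and the names `concept_lsi` and `cosine` are all assumptions made for illustration, and the paper may weight or construct the ontology mapping differently.

```python
import numpy as np

def concept_lsi(term_doc, term_to_concept, k):
    """Map a term-by-document matrix into concept space, then run truncated SVD.

    term_doc:        (n_terms, n_docs) term-by-document matrix
    term_to_concept: (n_terms, n_concepts) projection matrix (assumed binary here)
    k:               number of LSI dimensions to keep
    """
    # Concept-by-document matrix: each term's weight is added to its concept.
    concept_doc = term_to_concept.T @ term_doc
    # SVD is computed on the smaller concept-by-document matrix.
    _, s, vt = np.linalg.svd(concept_doc, full_matrices=False)
    k = min(k, len(s))
    # Document coordinates in the k-dimensional LSI space (one row per document).
    doc_vectors = (np.diag(s[:k]) @ vt[:k]).T
    return concept_doc, doc_vectors

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy corpus. Rows: car, automobile, engine, doctor, physician, hospital.
term_doc = np.array([
    [2, 0, 0, 0],
    [0, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 2],
], dtype=float)

# Hypothetical ontology mapping. Columns: Vehicle, Engine, Physician, Hospital.
term_to_concept = np.array([
    [1, 0, 0, 0],  # car        -> Vehicle
    [1, 0, 0, 0],  # automobile -> Vehicle
    [0, 1, 0, 0],  # engine     -> Engine
    [0, 0, 1, 0],  # doctor     -> Physician
    [0, 0, 1, 0],  # physician  -> Physician
    [0, 0, 0, 1],  # hospital   -> Hospital
], dtype=float)

concept_doc, doc_vectors = concept_lsi(term_doc, term_to_concept, k=2)

# Compare document 0 ("car") with document 1 ("automobile") before and after.
print("term space:", cosine(term_doc[:, 0], term_doc[:, 1]))
print("concept LSI space:", cosine(doc_vectors[0], doc_vectors[1]))
```

In this toy setup the SVD runs on a 4 x 4 matrix instead of a 6 x 4 one; with a realistic vocabulary the reduction from terms to ontology concepts would be far larger, which is where the claimed savings would come from.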
Keywords :
information retrieval; ontologies (artificial intelligence); singular value decomposition; concepts mapping; domain ontology; information retrieval; latent semantic indexing; singular-value decomposition; vector space; Computer science; Data mining; Indexing; Information retrieval; Information technology; Laboratories; Large scale integration; Ontologies; Partial response channels; Space technology; LSI; Latent Semantic Indexing; dimension reduction; domain ontology;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-4515-8
Electronic_ISBN :
978-1-4244-2780-2
Type :
conf
DOI :
10.1109/NLPKE.2008.4906768
Filename :
4906768