DocumentCode :
2699405
Title :
A topic based indexing approach for searching in documents
Author :
Osuna-Ontiveros, Daniel ; Lopez-Arevalo, Ivan ; Sosa-Sosa, Victor
Author_Institution :
Inf. Technol. Lab., CINVESTAV-IPN, Tamaulipas, Mexico
fYear :
2011
fDate :
26-28 Oct. 2011
Firstpage :
1
Lastpage :
6
Abstract :
Nowadays, computer users store large numbers of text documents, which calls for fast and precise search over them. The goal of Information Retrieval (IR) models is to provide users with the documents that satisfy their information needs. The core of such models is the document representation used for indexing. Traditional IR models rely on the frequency of query terms. Their disadvantage is that they consider only the terms in the query and ignore similar terms. This paper proposes a topic-based indexing approach that represents the topics associated with documents. Documents are modeled using clustering algorithms based on natural language processing. The result is a document-topic matrix representation denoting the importance of each topic within each document. Similarly, each query over the documents is converted into a vector of topics. A similarity measure can then be applied to this vector and the document matrix to retrieve the most relevant documents.
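The retrieval step described in the abstract (a query topic vector compared against a document-topic matrix) might look roughly like the sketch below. The abstract only mentions "a similarity measure", so the use of cosine similarity here is an assumption, as are the function names, the three-topic layout, and the toy weights; none of them come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors.

    Assumption: the abstract does not name the similarity measure;
    cosine is used here only as a common choice.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a vector with no topic weight matches nothing
    return dot / (norm_u * norm_v)


def rank_documents(doc_topic, query_topics):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, cosine(weights, query_topics))
        for doc_id, weights in doc_topic.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical document-topic matrix: one row per document,
# one column per topic, each value the topic's importance.
doc_topic = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.2, 0.7, 0.1],
    "doc_c": [0.0, 0.1, 0.9],
}

# Hypothetical query already converted into a vector over the same topics.
query_topics = [1.0, 0.0, 0.0]

ranking = rank_documents(doc_topic, query_topics)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

How the topics are derived and how a query is mapped onto them are the substance of the paper and are not reproduced here; the sketch covers only the final comparison.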
Keywords :
document handling; indexing; information needs; matrix algebra; natural language processing; pattern clustering; query formulation; vectors; clustering algorithm; document indexing; document matrix; document querying; document representation; document searching; document-topic matrix representation; information needs; information retrieval model; natural language processing; similarity measure; topic based indexing approach; topics vector; Computational modeling; Indexing; Information retrieval; Mathematical model; Proposals; Semantics; Vectors; Language technologies for IR; Semantic search; Semi-structured information retrieval;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical Engineering Computing Science and Automatic Control (CCE), 2011 8th International Conference on
Conference_Location :
Merida City
Print_ISBN :
978-1-4577-1011-7
Type :
conf
DOI :
10.1109/ICEEE.2011.6106659
Filename :
6106659