DocumentCode :
2283913
Title :
Two-Stage Model for Information Filtering
Author :
Zhou, Xujuan ; Li, Yuefeng ; Bruza, Peter ; Xu, Yue ; Lau, Raymond Y K
Author_Institution :
Fac. of Inf. Technol., Queensland Univ. of Technol., Brisbane, QLD
Volume :
3
fYear :
2008
fDate :
9-12 Dec. 2008
Firstpage :
685
Lastpage :
689
Abstract :
This thesis presents a novel two-stage model that integrates theories and techniques from information retrieval/filtering (IR/IF) with those from machine learning and data mining to provide more precise document filtering and retrieval. The first stage, topic filtering, is intended to minimize information mismatch by using term-based profiles to filter out the documents most likely to be irrelevant. Thus, only a relatively small number of potentially highly relevant documents remain for document ranking. The second stage uses a pattern mining approach and aims to solve the problem of information overload. It is precision oriented: the documents most likely to be relevant are assigned higher ranks by exploiting patterns in the pattern taxonomy. Because relatively few documents are involved at this stage, computational cost is markedly reduced while results improve significantly. The new two-stage information filtering model has been evaluated in extensive experiments. The tests followed well-known IR benchmarking processes and used the latest version of the Reuters dataset, namely Reuters Corpus Volume 1 (RCV1). The performance of the new model was compared with both term-based and data mining-based IF models. The results show that combining the strengths of information filtering and data mining methods achieves more effective and efficient information access.
Keywords :
data mining; document handling; information filtering; learning (artificial intelligence); document filtering; document ranking; document retrieval; information mismatch minimization; information overload problem; information retrieval; machine learning; pattern mining; pattern taxonomy; term-based profile; topic filtering stage; two-stage model; Filtering theory; Information filters; Information technology; Intelligent agent; Internet; Search engines; Software agents; sequential pattern mining; user profile
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
Type :
conf
DOI :
10.1109/WIIAT.2008.390
Filename :
4740871