DocumentCode :
430495
Title :
Similarity score for information filtering thresholds
Author :
Lai, Jun ; Soh, Ben
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, Vic., Australia
Volume :
1
fYear :
2004
fDate :
26-29 Oct. 2004
Firstpage :
216
Abstract :
The rapid growth of on-line information has led to the development of many techniques for information filtering. The growth in both the amount of available information and the number of visitors to Web sites in recent years poses key challenges for information filtering and retrieval. Web visitors not only expect high-quality, relevant information, but also want it presented as efficiently as possible. Traditional filtering methods, however, consider only the relevance values of documents and fail to account for the efficiency of document retrieval. In this paper, we propose a new algorithm that calculates an index, called the document similarity score, based on elements of the document. A document profile is derived from this index. Documents whose similarity score exceeds a given threshold are clustered together. Using these pre-clustered documents, information filtering and retrieval can be made more efficient. Experimental results show that the proposed method greatly improves the efficiency of information filtering and retrieval.
Keywords :
Web sites; information filtering; search engines; document retrieval efficiency; document similarity score; information filtering thresholds; information retrieval; on-line information; pre-clustered documents; Books; Clustering algorithms; Conference proceedings; Crawlers; Electronic mail; Information filters
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communications and Information Technology, 2004. ISCIT 2004. IEEE International Symposium on
Print_ISBN :
0-7803-8593-4
Type :
conf
DOI :
10.1109/ISCIT.2004.1412482
Filename :
1412482