DocumentCode :
3483389
Title :
An encoding technique based on word importance for the clustering of Web documents
Author :
Zakos, J. ; Verma, Brijesh
Author_Institution :
Sch. of Inf. Technol., Griffith Univ., Australia
Volume :
5
fYear :
2002
fDate :
18-22 Nov. 2002
Firstpage :
2207
Abstract :
We present a word encoding and clustering technique that groups Web documents based on the importance of the words that appear in them. We use a two-level self-organizing map architecture to generate clusters of words and documents. We propose that, by capturing word importance information, similar documents can then be clustered to assist in Web document retrieval. A Web document retrieval system is presented to demonstrate how this approach could be integrated into Web search.
Keywords :
Internet; encoding; information retrieval; pattern clustering; search engines; self-organising feature maps; word processing; Web document clustering; Web document retrieval system; encoding technique; two level self-organizing map architecture; word encoding; word importance; word importance information; Encoding; Gold; Histograms; Information processing; Information retrieval; Information technology; Internet; Search engines; Self organizing feature maps; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
Print_ISBN :
981-04-7524-1
Type :
conf
DOI :
10.1109/ICONIP.2002.1201885
Filename :
1201885