DocumentCode :
3644581
Title :
Improving web clustering through a new modeling for web documents
Author :
Ioan Agavriloaei;Adrian Alexandrescu;Mitică Craus
Author_Institution :
Faculty of Automatic Control and Computer Engineering, “
fYear :
2011
Firstpage :
1
Lastpage :
6
Abstract :
The constant, rapid growth in the size and complexity of the Web creates new challenges for efficiently processing Web search results. Because Web content is dynamic and search engines return huge amounts of information, new methods are needed to better organize and model the information spread across the Web. In this paper, we propose structuring Web content as a hierarchical environment that takes into account the site content and structure, the HTML document structure, and term importance. Furthermore, we propose an effective partitional clustering algorithm for a Web site. Preliminary results demonstrate the effectiveness of the new Web content representation and the accuracy of the Web clustering algorithm.
Keywords :
"Clustering algorithms","HTML","Vectors","Accuracy","Algorithm design and analysis","Internet","Partitioning algorithms"
Publisher :
IEEE
Conference_Title :
System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
Print_ISBN :
978-1-4577-1173-2
Type :
conf
Filename :
6085702