DocumentCode :
3470997
Title :
Extracting Structure of Web Site Based on Hyperlink Analysis
Author :
Li, Feng
Author_Institution :
Sch. of Bus. Adm., South China Univ. of Technol., Guangzhou
fYear :
2008
fDate :
12-14 Oct. 2008
Firstpage :
1
Lastpage :
4
Abstract :
The structure of a Web site usually reflects the implicit logical relationships among its Web pages, and is widely used in Web mining and Web information retrieval. However, it is difficult for a machine to extract the structure of a Web site automatically, because of the many kinds of noise hyperlinks. This paper proposes an algorithm that extracts the structure of a Web site automatically based on hyperlink analysis. The algorithm identifies and filters noise hyperlinks by the patterns of the Web pages that these hyperlinks connect, rather than by the patterns of the hyperlinks themselves. It promises better performance than previous approaches. Preliminary results show that the proposed algorithm greatly improves both precision and recall.
Keywords :
Web sites; data mining; information analysis; information filters; information retrieval; Web information retrieval; Web mining; Web pages; Web site; hyperlink analysis; noise hyperlink filters; Algorithm design and analysis; Humans; Information filtering; Partitioning algorithms; Pattern analysis
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Wireless Communications, Networking and Mobile Computing, 2008. WiCOM '08. 4th International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-2107-7
Electronic_ISBN :
978-1-4244-2108-4
Type :
conf
DOI :
10.1109/WiCom.2008.2538
Filename :
4680727