Regional Information Center for Science and Technology - Extracting Structure of Web Site Based on Hyperlink Analysis

DocumentCode :

3470997

Title :

Extracting Structure of Web Site Based on Hyperlink Analysis

Author :

Li, Feng

Author_Institution :

Sch. of Bus. Adm., South China Univ. of Technol., Guangzhou

fYear :

2008

fDate :

12-14 Oct. 2008

Firstpage :

Lastpage :

Abstract :

The structure of a Web site usually reflects the implicit logical relationships among its Web pages, and is widely applied in Web mining and Web information retrieval. However, it is difficult for a machine to extract the structure of a Web site automatically from among the many kinds of noise hyperlinks. This paper proposes an algorithm that extracts the structure of a Web site automatically, based on hyperlink analysis. The algorithm identifies and filters noise hyperlinks by the patterns of the Web pages that these hyperlinks connect, rather than by the patterns of the hyperlinks themselves. It promises better performance than previous approaches. Preliminary results show that the proposed algorithm considerably improves both precision and recall.

Keywords :

Web sites; data mining; information analysis; information filters; information retrieval; Web information retrieval; Web mining; Web pages; Web site; hyperlink analysis; noise hyperlink filters; Algorithm design and analysis; Data mining; Humans; Information filtering; Information filters; Information retrieval; Partitioning algorithms; Pattern analysis; Web mining; Web pages;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Wireless Communications, Networking and Mobile Computing, 2008. WiCOM '08. 4th International Conference on

Conference_Location :

Dalian

Print_ISBN :

978-1-4244-2107-7

Electronic_ISBN :

978-1-4244-2108-4

Type :

conf

DOI :

10.1109/WiCom.2008.2538

Filename :

4680727

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3470997