Title :
Study on Semantic Representation of Web Information Based on Repeating Patterns
Author :
Gao, Kening ; Zhang, Bin ; Zhang, Yin ; Wei, Hongru ; Ma, Anxiang
Author_Institution :
Inst. of Comput. Applic. Technol., Northeastern Univ., Shenyang
Abstract :
Using repeating information that appears in Web pages to represent their semantic meaning can improve the accuracy of Web page classification. This paper analyzes and improves traditional repeating-pattern representation methods and proposes a new semantic representation of Web information based on repeating patterns. First, repeating patterns are formally described; then the repeating patterns of Web information are extracted and the correlative matrix is built. Finally, a gamma-approximate matching algorithm is used to compute the weights of the repeating patterns and to categorize the Web pages. Experimental results show that semantic representation of Web information based on repeating patterns is effective at extracting the topic characteristics of Web pages, and that this approach can also improve the accuracy of Web information classification.
Keywords :
Internet; classification; Web information classification; Web page categorization; Web page classification; approximate matching; correlative matrix; repeating patterns representation; semantic meaning; semantic representation; Algorithm design and analysis; Computer applications; Data mining; Fuzzy systems; Information analysis; Pattern analysis; Pattern matching; Performance analysis; Probability; Web pages;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Jinan, Shandong
Print_ISBN :
978-0-7695-3305-6
DOI :
10.1109/FSKD.2008.121