DocumentCode :
2341819
Title :
Automatically Organize Web Text Resources with Frequent Term Tree
Author :
Wang, Xiaofeng ; Shi, Zhongzhi
Author_Institution :
Key Lab. of Intell. Inf. Process., Grad. Univ. of Chinese Acad. of Sci., Beijing, China
Volume :
1
fYear :
2009
fDate :
11-14 Oct. 2009
Firstpage :
330
Lastpage :
335
Abstract :
With the expansion of the Web, automatically organizing large-scale text resources, e.g. Web pages, has become very important. Many Web sites, such as Google and Yahoo, use hierarchical classification trees to organize text resources on the Web. Users can easily find the text resources that meet their requirements by navigating these hierarchical classification trees. Typically, text resources on the Web are manually assigned to the nodes of the hierarchical classification tree, which limits the tree's ability to organize large-scale text resources. In this paper, we propose a Frequent Term Tree to improve the ability of the hierarchical classification tree to organize large-scale text resources on the Web. Unlike the FP-tree, which is used to discover frequent patterns efficiently, the frequent term tree is used to organize resources with frequent-pattern-based classification. The frequent term tree can accurately assign text resources to each node of the classification tree, and its ability to organize resources improves as incrementally classified text resources are added. The evaluation demonstrates that the frequent term tree can organize text resources effectively and efficiently.
Keywords :
Internet; Web sites; pattern classification; tree data structures; Web pages; Web text resources; frequent pattern based classification; frequent term tree; hierarchical classification trees; Classification tree analysis; Computers; Humans; Information processing; Information technology; Large-scale systems; Navigation; Organizing; adaptive; apriori; associative classification; frequent term
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology, 2009. CIT '09. Ninth IEEE International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-0-7695-3836-5
Type :
conf
DOI :
10.1109/CIT.2009.81
Filename :
5328061