DocumentCode
2699414
Title
Building A Document Class Hierarchy for Obtaining More Proper Bibliographies from Web
Author
Wang, Daling ; Yu, Ge ; Hu, Minghan ; Bao, Yubin ; Zhang, Meng
Author_Institution
Sch. of Inf. Sci. & Eng., Northeastern Univ., Shenyang
fYear
2005
fDate
8-9 April 2005
Firstpage
214
Lastpage
219
Abstract
To help researchers in scientific and technological fields find more appropriate information resources on the Web, an auxiliary search structure is proposed: a class hierarchy of documents built from the documents' keywords. To cover the contents of each document properly, the keywords are extracted by mining maximal sequential frequent phrases. In this paper, the concept of a maximal sequential frequent phrase is defined, and the corresponding mining algorithm is designed and implemented. The experiments show that keyword extraction using maximal sequential frequent phrases achieves a better F-measure than extraction using the traditional TFIDF weight. Moreover, compared with previous work, our extended class hierarchy tree represents hierarchical relationships either among the keywords themselves or between keywords and documents, so that queries at different professional levels can be supported.
Keywords
Internet; data mining; search engines; text analysis; TFIDF weight; World Wide Web; auxiliary search structure; bibliographies; document class hierarchy; document keywords; information resources; keyword extraction; maximal sequential frequent phrase mining; Algorithm design and analysis; Bibliographies; Books; Data mining; Information resources; Information science; Internet; Proposals; Search engines; Writing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information Retrieval and Integration, 2005. WIRI '05. Proceedings. International Workshop on Challenges in
Conference_Location
Tokyo
Print_ISBN
0-7695-2414-1
Type
conf
DOI
10.1109/WIRI.2005.13
Filename
1553016