DocumentCode
147856
Title
A framework for integrating bibliographical data of computer science publications
Author
Tien Do ; Dao Lam ; Tin Huynh
Author_Institution
Univ. of Inf. Technol. - Vietnam, Ho Chi Minh City, Vietnam
fYear
2014
fDate
27-29 April 2014
Firstpage
245
Lastpage
250
Abstract
In this paper, we propose a framework for integrating bibliographical data of computer science publications from heterogeneous digital libraries. The framework consists of three key components: a publication collector, a bibliographical parser, and a duplicate checker. To analyze the efficiency of the framework in integrating data from heterogeneous sources, we conduct experiments with three digital libraries: Microsoft Academic Search, CiteSeerX, and DBLP. At present, our integrated dataset contains 5,320,539 publications and 1,723,148 authors, together with their metadata. The dataset has more rows and columns than the other datasets. It could therefore be published for other studies that rely on bibliographical data, such as literature search, publication ranking, research trend identification, and article link mining.
Keywords
bibliographic systems; data integration; digital libraries; electronic publishing; meta data; CiteSeerX; DBLP; Microsoft Academic Search; article linking mining; bibliographical data integration; bibliographical parser; computer science publications; duplicated checker; framework efficiency analysis; heterogeneous digital libraries; heterogeneous sources; literature search; publication collector; publication ranking; research trend identification; Computer science; Crawlers; Data mining; Databases; IEEE Xplore; Libraries; Metasearch; Bibliographical Data; Data Integration; Digital Library; Focused Crawler; OAI-PMH
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing, Management and Telecommunications (ComManTel), 2014 International Conference on
Conference_Location
Da Nang
Print_ISBN
978-1-4799-2904-7
Type
conf
DOI
10.1109/ComManTel.2014.6825612
Filename
6825612