• DocumentCode
    147856
  • Title

    A framework for integrating bibliographical data of computer science publications

  • Author

    Tien Do ; Dao Lam ; Tin Huynh

  • Author_Institution
    Univ. of Inf. Technol. - Vietnam, Ho Chi Minh City, Vietnam
  • fYear
    2014
  • fDate
    27-29 April 2014
  • Firstpage
    245
  • Lastpage
    250
  • Abstract
    In this paper, we propose a framework for integrating bibliographical data of computer science publications from heterogeneous digital libraries. The framework consists of three key components: a publication collector, a bibliographical parser, and a duplicated checker. To analyze the efficiency of our framework in integrating data from heterogeneous sources, we conduct experiments with three digital libraries: Microsoft Academic Search, CiteSeerX, and DBLP. At this time, our integrated dataset contains 5,320,539 publications and 1,723,148 authors, together with their metadata. Compared with the others, our dataset has more rows and columns. Thus, it could be published for other studies related to bibliographical data, such as literature search, publication ranking, research trend identification, and mining links between articles.
  • Keywords
    bibliographic systems; data integration; digital libraries; electronic publishing; meta data; CiteSeerX; DBLP; Microsoft Academic Search; article linking mining; bibliographical data integration; bibliographical parser; computer science publications; duplicated checker; framework efficiency analysis; heterogeneous digital libraries; heterogeneous sources; literature search; publication collector; publication ranking; research trend identification; Computer science; Crawlers; Data mining; Databases; IEEE Xplore; Libraries; Metasearch; Bibliographical Data; Data Integration; Digital Library; Focused Crawler; OAI-PMH
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing, Management and Telecommunications (ComManTel), 2014 International Conference on
  • Conference_Location
    Da Nang
  • Print_ISBN
    978-1-4799-2904-7
  • Type

    conf

  • DOI
    10.1109/ComManTel.2014.6825612
  • Filename
    6825612