• DocumentCode
    230696
  • Title
    The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective
  • Author
    Wu, Jian ; Williams, Kresimir ; Khabsa, Madian ; Giles, C. Lee
  • Author_Institution
    Inf. Sci. & Technol., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2014
  • fDate
    22-25 Oct. 2014
  • Firstpage
    171
  • Lastpage
    176
  • Abstract
    CiteSeerX is a crawl-based digital library search engine providing free access to more than 4 million academic papers. Since metadata in the digital library is obtained through automatic extraction, it is inevitable that errors will occur. CiteSeerX offers a feature allowing registered users to correct paper metadata including titles, authors, abstracts, publication years, venues, etc. We claim that user corrections, as a form of crowd-collaboration, provide a useful and efficient way to improve metadata quality and the impact of the digital library. As evidence to support this claim, we investigate user corrections from the last 5 years and analyze: the nature of the corrections; the quality of the corrections; and the impact of the corrections on downloads.
  • Keywords
    digital libraries; groupware; meta data; search engines; CiteSeerX; crawl-based digital library search engine; crowd-collaboration; paper metadata correction; user corrections; Abstracts; Educational institutions; Google; History; Libraries; Manuals; Search engines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2014 International Conference on
  • Conference_Location
    Miami, FL
  • Type
    conf
  • Filename
    7014562