  • DocumentCode
    3079181
  • Title
    Using web resources for support of online-browsing of research papers
  • Author
    Ohta, Manabu ; Hachiki, Toshihiro ; Takasu, Atsuhiro
  • Author_Institution
    Okayama Univ., Okayama, Japan
  • fYear
    2009
  • fDate
    10-12 Aug. 2009
  • Firstpage
    348
  • Lastpage
    353
  • Abstract
    With more appropriate linkage of digital libraries to Web resources, online-browsing of research papers would be much more comfortable, since many digital libraries of research papers are online and accessible from the Web. This paper proposes a browsing support system for reading research papers online that uses the OCRed text of scanned academic articles. Our digital library stores scanned document images of research papers; hence, their OCRed texts can be obtained cheaply. The proposed system extracts technical terms from the OCRed text, searches the Web for the best explanatory descriptions of the terms, and links the terms to the retrieved Web pages. This paper also describes an experimental exploration of several aspects of the proposed system, including the extraction accuracy of technical terms and the precision of retrieved Web pages.
  • Keywords
    Internet; digital libraries; document image processing; image retrieval; online front-ends; optical character recognition; text analysis; OCR; OCRed text research paper; Web page; Web resource; digital library; online-browsing support system; scanned academic article; scanned document image; system extract technical term; Books; Data mining; Databases; Dictionaries; Optical character recognition software; Software libraries; Web pages; Wikipedia; XML; Browsing support; Research papers; Term extraction; Web resources
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse & Integration, 2009. IRI '09. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    978-1-4244-4114-3
  • Electronic_ISBN
    978-1-4244-4116-7
  • Type
    conf
  • DOI
    10.1109/IRI.2009.5211577
  • Filename
    5211577