• DocumentCode
    2021508
  • Title
    Automatic Extraction of Data from 2-D Plots in Documents
  • Author
    Lu, Xiaonan ; Wang, James Z. ; Mitra, Prasenjit ; Giles, C. Lee
  • Author_Institution
    Pennsylvania State Univ., State College
  • Volume
    1
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    188
  • Lastpage
    192
  • Abstract
    Two-dimensional (2-D) plots in digital documents contain important information. Often, the results of scientific experiments and performance of businesses are summarized using plots. Although 2-D plots are easily understood by human users, current search engines rarely utilize the information contained in the plots to enhance the results returned in response to queries posed by end users. We propose an automated algorithm for extracting information from line curves in 2-D plots. The extracted information can be stored in a database and indexed to answer end-user queries and enhance search results. We have collected 2-D plot images from a variety of resources and tested our extraction algorithms. Experimental evaluation has demonstrated that our method can produce results suitable for real-world use.
  • Keywords
    database indexing; document image processing; image retrieval; image thinning; automatic data extraction; database indexing; digital document 2D plots; end-user queries; image thinning algorithm; information extraction algorithm; line curves; search engines; Data analysis; Data mining; Engineering drawings; Graphics; Histograms; Humans; Image databases; Search engines; Testing; Two dimensional displays;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.2007.4378701
  • Filename
    4378701