• DocumentCode
    1582162
  • Title

    Automatic table ground truth generation and a background-analysis-based table structure extraction method

  • Author

    Wang, Yalin ; Phillips, Ihsin T. ; Haralick, Robert

  • Author_Institution
    Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    528
  • Lastpage
    532
  • Abstract
    We first describe an automatic table ground truth generation system that can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. We then describe a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm. We discuss an experimental protocol for evaluating table detection algorithms. On a total of 1,125 document pages containing 518 table entities and 10,941 cell entities, our table detection algorithm takes line and word segmentation results as input and achieves a correct cell detection rate of around 90%.
  • Keywords
    document image processing; image segmentation; X-Y cut table decomposition algorithm; background analysis-based identification; document layout analysis; experimental results; line segmentation; table detection algorithms; table ground truth generation system; table structure extraction method; word segmentation; Clustering algorithms; Computer science; Data mining; Detection algorithms; Educational institutions; Image analysis; Image segmentation; Partitioning algorithms; Protocols; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
  • Conference_Location
    Seattle, WA
  • Print_ISBN
    0-7695-1263-1
  • Type

    conf

  • DOI
    10.1109/ICDAR.2001.953845
  • Filename
    953845