  • DocumentCode
    3695257
  • Title
    A hybrid approach to discover semantic hierarchical sections in scholarly documents
  • Author
    Suppawong Tuarob; Prasenjit Mitra; C. Lee Giles
  • Author_Institution
    Information and Communication Technology, Mahidol University, Thailand
  • fYear
    2015
  • Firstpage
    1081
  • Lastpage
    1085
  • Abstract
    Scholarly documents are usually composed of sections, each of which serves a different purpose by conveying specific context. The ability to automatically identify sections would allow us to understand how the semantics differ across the sections of a document, such as what the introduction covers, which methodologies are used, experiment types, trends, etc. We propose a set of hybrid algorithms to 1) automatically identify section boundaries, 2) recognize standard sections, and 3) build a hierarchy of sections. Our algorithms achieve an F-measure of 92.38% in section boundary detection, 96% average accuracy in standard section recognition, and 95.51% accuracy in the section positioning task.
  • Keywords
    "Support vector machines","Niobium","Radio frequency","Yttrium","Accuracy"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICDAR.2015.7333927
  • Filename
    7333927