• DocumentCode
    3756560
  • Title

    Automatic Extraction of Main Thesis Documents Fields Using Decision Trees

  • Author

    Alaa Mahmoud Sobhy;Yasser M. Kamal;Atef Zaki Ghalwash

  • Author_Institution
    Coll. of Comput. &
  • fYear
    2015
  • Firstpage
    203
  • Lastpage
    208
  • Abstract
    Thesis documents are underestimated even though they hold large sets of useful information, since they include most of the research details. Because they are harder to obtain, researchers are led to depend on research papers, even though papers have size limitations and lack elaboration. A lot of time and effort is invested in research, so linking researchers based on their work would facilitate the process of solving research problems. A major step toward this goal is to structure thesis documents by extracting fields such as title, author and abstract. This paper presents a way to structure semi-structured thesis documents using decision trees in 4 different ways (Simple, Medium, Complex and using KNIME); together they scored an overall accuracy of 99.2%.
  • Keywords
    "Decision trees","Feature extraction","Training","Data mining","Testing","Databases","Predictive models"
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Computational Intelligence (CSCI), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/CSCI.2015.164
  • Filename
    7424091