Title :
Automatic Extraction of Main Thesis Documents Fields Using Decision Trees
Author :
Alaa Mahmoud Sobhy;Yasser M. Kamal;Atef Zaki Ghalwash
Author_Institution :
Coll. of Comput. &
Abstract :
Thesis documents are underestimated even though they hold large sets of useful information, since they include most of the research information. Because they are harder to obtain, researchers have been led to depend on research papers instead, even though papers have a size limitation and lack elaboration. A lot of time and effort is invested in research, so linking researchers based on their work would help facilitate the process of solving research problems. A major step toward this goal is to structure thesis documents by extracting fields such as title, author and abstract. This paper presents a way to structure semi-structured thesis documents using decision trees in 4 different ways (Simple, Medium, Complex and using KNIME), which scored an overall accuracy of 99.2%.
Keywords :
"Decision trees","Feature extraction","Training","Data mining","Testing","Databases","Predictive models"
Conference_Title :
Computational Science and Computational Intelligence (CSCI), 2015 International Conference on
DOI :
10.1109/CSCI.2015.164