DocumentCode :
2149319
Title :
Functional-Based Table Category Identification in Digital Library
Author :
Kim, Seongchan ; Liu, Ying
Author_Institution :
Dept. of Knowledge Service Eng., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
1364
Lastpage :
1368
Abstract :
A better understanding of a document's logical components is crucial to many applications, e.g., document classification or data integration. With the development of digital libraries, more people realize the importance of scientific tables, which present valuable information concisely. Although much previous work on tables focuses on table data extraction, few concrete works address understanding and utilizing the extracted table data. Based on a large-scale quantitative study of scientific papers, we believe that identifying the table author's original purpose can improve table data comprehension and facilitate table data reuse. In this paper, scientific document tables are classified into three topical categories (background, system/method, and experimental) and two functional categories (commentary and comparison). We apply machine-learning-based methods to the table classification task. Our results demonstrate that the proposed features are effective for classification and that our proposed method significantly outperforms the rule-based baseline.
Keywords :
classification; digital libraries; learning (artificial intelligence); data integration; digital library; document classification; document logical component; functional-based table category identification; machine learning; scientific document table; table classification task; table data comprehension; table data reusability; Data mining; Feature extraction; Instruments; Libraries; Portable document format; Search engines; Support vector machines; content analysis; document table; function-based classification; table category;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.274
Filename :
6065533