DocumentCode
3306674
Title
Clustering and classification of document structure - a machine learning approach
Author
Dengel, Andreas ; Dubiel, Frank
Author_Institution
German Res. Center for Artificial Intelligence, Kaiserslautern, Germany
Volume
2
fYear
1995
fDate
14-16 Aug 1995
Firstpage
587
Abstract
We describe a system capable of learning the presentation of document logical structures, demonstrated on business letters. Given a set of instances, the system clusters them into structural concepts and induces a concept hierarchy. This concept hierarchy then serves as the source for classifying future input. The paper introduces the different learning steps, describes how the resulting concept hierarchy is applied to logical labeling, and reports on the results.
Keywords
business data processing; classification; document handling; knowledge based systems; learning by example; technical presentation; business letters; concept hierarchy; document logical structure presentation; document structure classification; document structure clustering; logical labeling; machine learning approach; Artificial intelligence; Classification tree analysis; Costs; Decision trees; Fuzzy logic; Information retrieval; Labeling; Logic testing; Machine learning; Text analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
0-8186-7128-9
Type
conf
DOI
10.1109/ICDAR.1995.601965
Filename
601965