DocumentCode :
1945372
Title :
Knowledge discovery method to accomplish English document classification
Author :
Ghada, Elmarhomy ; Atlam, Elsayed ; Hanafusa, Hiro ; Fuketa, Masao ; Morita, Kazuhiro ; Aoe, Jun-Ichi
Author_Institution :
Dept. of Inf. Sci. & Intelligent Syst., Tokushima Univ., Japan
fYear :
2005
fDate :
19-21 May 2005
Firstpage :
268
Abstract :
Although there is much research on text classification based on vector spaces using word information from the whole text, humans can generally recognize the field by finding specific words. This paper describes what a field-associated term is and how to discover field-associated terms, which exist in any text. In this paper, such words are called field association (FA) words, which can be directly related to the field classification. Five criteria for FA terms are defined for hierarchical fields. All of the terms are stored in a field tree, which is used to extract field-coherent passages for document classification. The presented approach is evaluated by the simulation results of 140 fields text files of sports field and extended by 197 text field of civil engineering.
Keywords :
data mining; natural languages; text analysis; word processing; English document classification; field association word; field-associated term discovery; knowledge discovery; text classification; Civil engineering; Classification tree analysis; Data mining; Humans; Information science; Intelligent systems; Stability; Text categorization; Text recognition; Tree data structures;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Active Media Technology, 2005. (AMT 2005). Proceedings of the 2005 International Conference on
Print_ISBN :
0-7803-9035-0
Type :
conf
DOI :
10.1109/AMT.2005.1505330
Filename :
1505330