DocumentCode :
3597161
Title :
A new method for construction filed association terms using co-occurrence words and declinable words information
Author :
Atlam, El-Sayed ; Fuketa, M. ; Kashiji, S. ; Nakata, H. ; Aoe, Jun-Ichi
Author_Institution :
Dept. of Inf. Sci. & Intelligent Syst., Tokushima Univ., Japan
Volume :
4
fYear :
2002
Abstract :
Readers can identify the subject field of many documents by reading only a few specific words, called field association (FA) terms. Constructing these FA terms is important for correctly determining a document's field from the limited word information in part of the file. The field can be determined efficiently if there are many FA terms and their frequency rate is high. If the number of level 1 FA words (words that connect directly to terminal fields) is limited, earlier methods cannot determine the document's field easily and quickly, especially when the number of corpus documents is small. This paper proposes a new method for deciding FA terms using the weights of co-occurrence words and declinable words that are related to a narrow association category, while eliminating the ambiguity of FA terms. Moreover, efficient FA terms are difficult to extract using frequency information alone. The proposed method uses a new co-occurrence word weighting that achieves higher precision and recall than using the degree of frequency alone.
Keywords :
classification; information retrieval; natural languages; text analysis; vocabulary; association category; classification; co-occurrence words; corpus documents; declinable words; document field subject; field association terms; information retrieval; keywords; precision; recall; word weighting; Data mining; Frequency; Humans; Information retrieval; Information science; Intelligent systems; Research and development
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 2002 IEEE International Conference on
ISSN :
1062-922X
Print_ISBN :
0-7803-7437-1
Type :
conf
DOI :
10.1109/ICSMC.2002.1173247
Filename :
1173247