DocumentCode
344586
Title
Text categorization with the concept of fuzzy set of informative keywords
Author
Jo, Taeho C.
Author_Institution
Samsung SDS, South Korea
Volume
2
fYear
1999
fDate
22-25 Aug. 1999
Firstpage
609
Abstract
Text categorization is the procedure of assigning a document to one of a set of predefined categories. Informative keywords are those that reflect the contents of a document. A document contains both informative and non-informative keywords. Non-informative keywords mainly serve grammatical functions in sentences; such keywords, called functional keywords, reflect the document's contents very little, so they should be removed in the process of document indexing. The distinction between informative keywords and functional keywords is not crisp. In the process of document indexing, a document is conventionally represented as a crisp set of informative keywords. This paper proposes that a document be represented as a fuzzy set of informative keywords instead. Experiments on the categorization of news articles show that the proposed text categorization schemes outperform the schemes based on crisp sets.
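The contrast between a crisp and a fuzzy keyword set can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method of the paper: the abstract does not give the membership function or the classifier, so the stop list FUNCTIONAL, the damping factor 0.1, the frequency-based membership degree, the fuzzy Jaccard similarity, and all function names here are hypothetical stand-ins.

```python
from collections import Counter

# Hypothetical stop list standing in for "functional keywords".
FUNCTIONAL = {"the", "a", "of", "is", "in", "and", "to"}


def crisp_set(tokens):
    """Crisp representation: a keyword is either in the set or not."""
    return {t for t in tokens if t not in FUNCTIONAL}


def fuzzy_set(tokens):
    """Fuzzy representation: map each keyword to a degree in [0, 1].

    Assumption (not from the paper): the degree is the keyword's count
    relative to the most frequent keyword, and functional words are
    damped by a factor of 0.1 instead of being removed outright.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    top = max(counts.values())
    return {
        t: (c / top) * (0.1 if t in FUNCTIONAL else 1.0)
        for t, c in counts.items()
    }


def fuzzy_similarity(a, b):
    """Fuzzy Jaccard: sum of minima over sum of maxima of the degrees."""
    keys = set(a) | set(b)
    inter = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    union = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return inter / union if union else 0.0


def categorize(doc_tokens, prototypes):
    """Return the category whose fuzzy prototype is most similar to the document."""
    doc = fuzzy_set(doc_tokens)
    return max(prototypes, key=lambda c: fuzzy_similarity(doc, prototypes[c]))


# Toy usage: one prototype document per category.
prototypes = {
    "sports": fuzzy_set("the team won the match".split()),
    "finance": fuzzy_set("the bank raised the rate".split()),
}
label = categorize("the team lost the match".split(), prototypes)
```

In this sketch a functional word such as "the" keeps a small degree of membership rather than being cut off by a hard threshold, which is one way to model a boundary between informative and functional keywords that is not crisp.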
Keywords
category theory; data mining; fuzzy set theory; indexing; document indexing; functional keywords; informative keywords; text categorization; Data mining; Fuzzy sets; Hardware; Indexing; Information analysis; Internet; Network synthesis; Pattern analysis; Text categorization; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems Conference Proceedings, 1999. FUZZ-IEEE '99. 1999 IEEE International
Conference_Location
Seoul, South Korea
ISSN
1098-7584
Print_ISBN
0-7803-5406-0
Type
conf
DOI
10.1109/FUZZY.1999.793010
Filename
793010