• DocumentCode
    344586
  • Title
    Text categorization with the concept of fuzzy set of informative keywords
  • Author
    Jo, Taeho C.
  • Author_Institution
    Samsung SDS, South Korea
  • Volume
    2
  • fYear
    1999
  • fDate
    22-25 Aug. 1999
  • Firstpage
    609
  • Abstract
    Text categorization is the procedure of assigning one of a set of predefined categories to a particular document. Informative keywords are those that reflect the contents of a document. A document includes both informative and non-informative keywords. Non-informative keywords mainly serve grammatical functions in sentences; such keywords, called functional keywords, reflect the document's contents very little, so they should be removed in the process of document indexing. The discrimination between informative keywords and functional keywords is not crisp. In the process of document indexing, a document is conventionally represented as a set of informative keywords. This paper proposes that a document be represented as a fuzzy set of informative keywords instead of a crisp set. Experiments on the categorization of news articles show that the proposed text categorization schemes outperform the schemes with crisp sets.
  • Keywords
    category theory; data mining; fuzzy set theory; indexing; document indexing; functional keywords; informative keywords; text categorization; Fuzzy sets; Hardware; Information analysis; Internet; Network synthesis; Pattern analysis; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Fuzzy Systems Conference Proceedings, 1999. FUZZ-IEEE '99. 1999 IEEE International
  • Conference_Location
    Seoul, South Korea
  • ISSN
    1098-7584
  • Print_ISBN
    0-7803-5406-0
  • Type
    conf
  • DOI
    10.1109/FUZZY.1999.793010
  • Filename
    793010
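
The contrast drawn in the abstract between a crisp set and a fuzzy set of informative keywords can be illustrated with a minimal Python sketch. The abstract gives neither the membership function nor the categorization scheme used in the paper, so everything here is an assumption made for illustration: the membership grades are hand-picked hypothetical values, the crisp set is obtained by thresholding those grades (an alpha-cut), and the two similarity measures (Jaccard, and its min/max fuzzy counterpart) are standard textbook choices rather than the author's method.

```python
# Illustrative sketch only: the grades and similarity measures below are
# assumptions, not taken from the paper.

def alpha_cut(fuzzy_doc, alpha=0.5):
    """Crisp set of keywords whose membership grade is at least alpha."""
    return {word for word, grade in fuzzy_doc.items() if grade >= alpha}

def crisp_similarity(a, b):
    """Jaccard similarity between two crisp keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fuzzy_similarity(a, b):
    """Fuzzy Jaccard: sum of min grades divided by sum of max grades."""
    words = set(a) | set(b)
    num = sum(min(a.get(w, 0.0), b.get(w, 0.0)) for w in words)
    den = sum(max(a.get(w, 0.0), b.get(w, 0.0)) for w in words)
    return num / den if den else 0.0

# Hypothetical documents: keyword -> degree of being "informative".
doc_a = {"fuzzy": 0.9, "keyword": 0.8, "however": 0.2}
doc_b = {"fuzzy": 0.7, "category": 0.6, "however": 0.3}

crisp_score = crisp_similarity(alpha_cut(doc_a), alpha_cut(doc_b))
fuzzy_score = fuzzy_similarity(doc_a, doc_b)
print(crisp_score, fuzzy_score)
```

In the crisp representation the borderline word "however" is discarded outright, whereas the fuzzy representation keeps it with a low grade, so it still contributes a small amount to the comparison.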