Title :
Fuzzy data mining: effect of fuzzy discretization
Author :
Ishibuchi, Hisao ; Yamamoto, Takashi ; Nakashima, Tomoharu
Author_Institution :
Dept. of Ind. Eng., Osaka Prefecture Univ., Japan
Abstract :
Generating association rules requires discretizing continuous attributes into intervals, although human knowledge representation is not always based on such a discretization. For example, we usually use linguistic terms (e.g., young, middle-aged, and old) to divide ages into fuzzy categories. We describe the extraction of linguistic association rules and examine the performance of the extracted rules. First, we modify the definitions of the two basic measures of association rules (i.e., confidence and support) so that they apply to linguistic association rules. The main difference between standard and linguistic association rules is the discretization of continuous attributes: when we extract linguistic association rules, we divide the domain interval of each attribute into fuzzy regions (i.e., linguistic terms). Next, we compare fuzzy discretization with standard non-fuzzy discretization through computer simulations on a pattern classification problem with many continuous attributes. The classification performance of the extracted rules on unseen test patterns is examined under various conditions. Simulation results show that linguistic association rules with rule weights have high generalization ability even when the domain of each continuous attribute is homogeneously partitioned.
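The abstract does not give the modified definitions of support and confidence, so the following is only an illustrative sketch of one common way to compute them for a linguistic rule. It assumes attributes normalized to [0, 1], a homogeneous partition into k evenly spaced triangular fuzzy sets, and a compatibility grade formed as the product of memberships. All function names and these modeling choices are assumptions made here, not taken from the paper.

```python
def triangular_memberships(x, k=3):
    """Membership of x (assumed in [0, 1]) in each of k evenly spaced
    triangular fuzzy sets. The homogeneous partition is an assumption."""
    width = 1.0 / (k - 1)
    return [max(0.0, 1.0 - abs(x - i * width) / width) for i in range(k)]


def compatibility(pattern, antecedent, k=3):
    """Grade to which a pattern matches a linguistic antecedent.

    antecedent holds one fuzzy-set index per attribute (None = don't care).
    Using the product of memberships is an assumed choice.
    """
    grade = 1.0
    for x, term in zip(pattern, antecedent):
        if term is not None:
            grade *= triangular_memberships(x, k)[term]
    return grade


def fuzzy_support_confidence(patterns, labels, antecedent, consequent, k=3):
    """Fuzzy analogues of support and confidence for 'antecedent => consequent'.

    Crisp counts are replaced by sums of compatibility grades.
    """
    grades = [compatibility(p, antecedent, k) for p in patterns]
    total = sum(grades)
    matching = sum(g for g, y in zip(grades, labels) if y == consequent)
    support = matching / len(patterns)
    confidence = matching / total if total > 0 else 0.0
    return support, confidence


# Hypothetical one-attribute data set: rule "if x is small then class a".
patterns = [(0.0,), (0.25,), (0.5,), (1.0,)]
labels = ["a", "a", "b", "b"]
support, confidence = fuzzy_support_confidence(patterns, labels, [0], "a")
print(support, confidence)
```

With interval (non-fuzzy) discretization, each grade would be 0 or 1 and the two sums would reduce to the usual pattern counts; in this sketch, a pattern near a region boundary contributes partially to the neighboring linguistic terms.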
Keywords :
associative processing; computational linguistics; data mining; fuzzy set theory; knowledge based systems; knowledge representation; pattern classification; association rule generation; basic measures; computer simulations; continuous attribute; continuous attributes; discretization; domain interval; extracted rules; fuzzy categories; fuzzy data mining; fuzzy discretization; fuzzy regions; generalization ability; knowledge representation; linguistic association rules; linguistic terms; pattern classification problem; rule weights; standard nonfuzzy discretization; unseen test patterns; Association rules; Computational modeling; Computer simulation; Data mining; Fuzzy logic; Humans; Industrial engineering; Knowledge representation; Machine learning; Testing;
Conference_Title :
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
DOI :
10.1109/ICDM.2001.989525