Title :
A genetic algorithm for discretization of decision systems
Author_Institution :
Inst. of Artificial Intelligence, Zhejiang Univ., China
Abstract :
Discretization of real-valued attributes is an important problem in rough-set-based data mining. Discretization based on rough sets has some particular characteristics: consistency needs to be satisfied, and the set of cuts used for discretization is expected to be as small as possible. The consistent and minimal discretization problem is NP-complete. A genetic algorithm for consistent and minimal discretization of decision systems is proposed. In the genetic algorithm, a chromosome is represented as a binary string whose length equals the number of candidate cuts. Two weight factors are introduced into the definition of the fitness function to balance consistency against the minimality of the cut set. Experiments show that the algorithm performs better than the greedy method and the well-known ChiMerge method. The algorithm solves the consistent and minimal discretization of decision systems effectively.
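The encoding described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fitness formula, so the weighted sum below, the weight values, and all function names (candidate_cuts, discretize, fitness) are assumptions made for illustration, not the authors' definitions. Candidate cuts are taken here as midpoints between consecutive distinct attribute values, which is a common convention but is also an assumption.

```python
from collections import defaultdict


def candidate_cuts(rows):
    """Return (attribute_index, cut_value) pairs.

    Assumption: candidates are midpoints between consecutive
    distinct values of each attribute.
    """
    cuts = []
    n_attrs = len(rows[0][0])
    for a in range(n_attrs):
        values = sorted({x[a] for x, _ in rows})
        cuts.extend((a, (lo + hi) / 2) for lo, hi in zip(values, values[1:]))
    return cuts


def discretize(x, cuts, chromosome):
    """Map an object to interval indices using only the selected cuts.

    chromosome[i] == 1 means candidate cut i is kept. The interval
    index of attribute a is the number of kept cuts below x[a].
    """
    code = [0] * len(x)
    for (a, c), bit in zip(cuts, chromosome):
        if bit and x[a] > c:
            code[a] += 1
    return tuple(code)


def fitness(chromosome, rows, cuts, w_consistency=0.9, w_size=0.1):
    """Hypothetical fitness: weighted sum of consistency and cut-set size.

    Consistency is the fraction of objects whose discretized description
    is shared only by objects with the same decision. The size term
    rewards chromosomes that select fewer cuts.
    """
    groups = defaultdict(list)
    for x, decision in rows:
        groups[discretize(x, cuts, chromosome)].append(decision)
    consistent = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    consistency = consistent / len(rows)
    size_term = 1 - sum(chromosome) / len(chromosome)
    return w_consistency * consistency + w_size * size_term
```

With a large consistency weight, a chromosome that keeps a small set of cuts while still separating all decision classes scores higher than one that keeps every cut, and both score higher than one that merges objects with different decisions. A genetic algorithm would evolve such binary strings with selection, crossover, and mutation; those operators are omitted here because the abstract does not describe them.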
Keywords :
data mining; decision theory; genetic algorithms; rough set theory; ChiMerge method; NP complete problem; binary string; chromosome; consistent discretization problem; decision systems; fitness function; greedy method; minimal discretization problem; weight factor; Algorithm design and analysis; Artificial intelligence; Biological cells; Machine learning; Mathematics; Robustness; Set theory; Uncertainty
Conference_Title :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
DOI :
10.1109/ICMLC.2004.1381977