Title :
Mining association rules from databases with continuous attributes using genetic network programming
Author :
Taboada, Karla ; Gonzales, Eloy ; Shimada, Kaoru ; Mabu, Shingo ; Hirasawa, Kotaro ; Hu, Jinglu
Author_Institution :
Production & Syst. Waseda Univ., Fukuoka
Abstract :
Most association rule mining algorithms rely on discretization algorithms to handle continuous attributes. Discretization is the process of transforming the values of a continuous attribute into a finite number of intervals and assigning each interval a discrete numerical value. However, with discretization methods it is difficult to obtain the highest attribute interdependency and, at the same time, the lowest number of intervals. In this paper we present an association rule mining algorithm that is suited to the continuous-valued attributes commonly found in scientific and statistical databases. We propose a method using a new graph-based evolutionary algorithm named "genetic network programming (GNP)" that can deal with continuous values directly, that is, without using any discretization method as a preprocessing step. GNP represents its individuals as graph structures and evolves them in order to find a solution; this feature contributes to creating quite compact programs and to implicitly memorizing past action sequences. In the proposed method using GNP, the significance of each extracted association rule is measured with the chi-squared test, and only important association rules are stored in a pool that accumulates them across generations. Results of experiments conducted on a real-life database suggest that the proposed method provides an effective technique for handling continuous attributes.
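The abstract says that rule significance is measured with the chi-squared test but does not give the formula. The sketch below shows one common way to compute the statistic for a rule X => Y from a 2x2 contingency table with one degree of freedom. The function names, the count-based inputs, and the 5% critical value are assumptions made for illustration, not details taken from the paper.

```python
# Illustrative sketch only: the names and the threshold below are assumptions,
# not the paper's implementation.

# Critical value of the chi-squared distribution, 1 degree of freedom, 5% level.
CHI2_CRITICAL_5_PERCENT = 3.841


def chi_squared(n, count_x, count_y, count_xy):
    """Chi-squared statistic for the 2x2 table of antecedent X vs consequent Y.

    n        -- total number of records
    count_x  -- records satisfying X
    count_y  -- records satisfying Y
    count_xy -- records satisfying both X and Y
    """
    x = count_x / n    # support(X)
    y = count_y / n    # support(Y)
    z = count_xy / n   # support(X and Y)
    denom = x * y * (1 - x) * (1 - y)
    if denom == 0:
        # X or Y holds for all or none of the records; the test is undefined,
        # so this sketch treats the rule as not significant.
        return 0.0
    return n * (z - x * y) ** 2 / denom


def is_significant(n, count_x, count_y, count_xy,
                   critical=CHI2_CRITICAL_5_PERCENT):
    """True when X and Y are statistically dependent at the chosen level."""
    return chi_squared(n, count_x, count_y, count_xy) > critical


if __name__ == "__main__":
    # 100 records, X holds in 50, Y holds in 50, both hold in 40.
    print(chi_squared(100, 50, 50, 40))
    print(is_significant(100, 50, 50, 40))
```

When the joint support equals the product of the individual supports, the statistic is zero, which corresponds to X and Y being independent; larger values indicate stronger dependence between the two sides of the rule.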
Keywords :
data mining; database management systems; genetic algorithms; graph theory; association rule mining; chi-squared test; continuous attribute handling; continuous valued attribute; databases; discrete numerical value; discretization algorithm; genetic network programming; graph structure; graph-based evolutionary algorithm; Association rules; Data mining; Databases; Decision support systems; Economic indicators; Evolutionary computation; Genetic programming; Probability; Statistical analysis; Testing;
Conference_Title :
Evolutionary Computation, 2007. CEC 2007. IEEE Congress on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-1339-3
Electronic_ISBN :
978-1-4244-1340-9
DOI :
10.1109/CEC.2007.4424622