Title :
Implementation of genetic network programming and knapsack problem for record clustering on distributed database
Author :
Wedashwara, Wirarama ; Mabu, Shingo ; Obayashi, Masanao ; Kuremoto, Takashi
Author_Institution :
Grad. Sch. of Sci. & Eng., Yamaguchi Univ., Yamaguchi, Japan
Abstract :
This research implements genetic network programming (GNP) and the knapsack problem (KP) to solve record clustering on distributed databases. The objective is to distribute big data across sites of limited capacity while taking into account the similarity of the data placed at each site. GNP is used to extract rules from big data by considering the characteristics (value ranges) of each attribute in a dataset. KP is used to distribute the rules to the sites, treating the similarity associated with each rule as its value and the data amount as its weight, so that the assignment matches the site capacities.
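Illustrative sketch (not taken from the paper): the abstract does not give the exact KP formulation, so the following assumes a textbook 0/1 knapsack solved by dynamic programming, applied to one site at a time in a greedy pass. The names knapsack_select and distribute, the rule tuples, and the integer weights are all hypothetical.

```python
def knapsack_select(rules, capacity):
    """Pick rules maximizing total value within an integer capacity.

    rules: list of (name, value, weight), where value stands for similarity
    and weight for data amount (assumed to be a non-negative integer).
    Returns (best_total_value, list_of_chosen_rule_names).
    """
    # best[c] holds the best (value, chosen names) found for capacity c.
    best = [(0, [])] * (capacity + 1)
    for name, value, weight in rules:
        # Iterate capacities downward so each rule is used at most once.
        for c in range(capacity, weight - 1, -1):
            candidate = best[c - weight][0] + value
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - weight][1] + [name])
    return best[capacity]


def distribute(rules, capacities):
    """Assumed greedy scheme: fill each site in turn with a 0/1 knapsack."""
    remaining = list(rules)
    placement = []
    for capacity in capacities:
        _, chosen = knapsack_select(remaining, capacity)
        placement.append(chosen)
        remaining = [r for r in remaining if r[0] not in chosen]
    leftover = [r[0] for r in remaining]
    return placement, leftover


# Hypothetical rules: (name, similarity value, data amount)
rules = [("r1", 6, 3), ("r2", 5, 2), ("r3", 4, 2)]
placement, leftover = distribute(rules, [4, 3])
print(placement, leftover)
```

Filling sites one after another is only one possible reading of "distribute rules to each site"; a joint multiple-knapsack formulation would be an equally plausible alternative.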
Keywords :
Big Data; combinatorial mathematics; distributed databases; genetic algorithms; knapsack problems; pattern clustering; Big Data distribution; GNP; KP; Knapsack problem; attribute characteristics; attribute value ranges; combinational optimization problem; data amount weight; distributed data similarity value; distributed databases; genetic network programming; record clustering; rule extraction; site capacity matching; Data mining; Distributed databases; Economic indicators; Genetics; Optimization; Programming; Database Clustering; Genetic Network Programming; Knapsack Problem; Record Clustering;
Conference_Title :
Proceedings of the SICE Annual Conference (SICE), 2014
Conference_Location :
Sapporo
DOI :
10.1109/SICE.2014.6935234