Title :
Efficient probability density balancing for supporting distributed knowledge discovery in large databases
Author :
Obradovic, Dragan ; Obradovic, Zoran
Author_Institution :
Corp. Technol., Inf. & Commun., Siemens AG, Munich, Germany
Abstract :
This article addresses data received online from a source with an unknown probability distribution: how to efficiently partition the data into smaller representative subsets (databases), and how to organize these subsets so as to minimize the computational cost of later data analysis. The proposed linear-time, online problem decomposition method achieves these objectives by balancing the probability distributions of the individual disjoint data subsets, each of which is intended to approximate the original data-source distribution. Consequently, computationally efficient statistical data analysis and neural network modelling on data subsets that fit into a computer's central memory will produce results similar to those obtained through a global, computationally infeasible data analysis. In addition, the proposed decomposition scheme enables effective distributed data analysis on a network of workstations.
Keywords :
data analysis; data mining; distributed processing; neural nets; probability; real-time systems; very large databases; data-source distribution; knowledge discovery; large databases; neural network; probability density; statistical data analysis; Computer networks; Data analysis; Data mining; Decision making; Distributed computing; Distributed databases; Gaussian distribution; Neural networks; Probability distribution; Radio access networks
Conference_Title :
Neural Networks, 1999. IJCNN '99. International Joint Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-5529-6
DOI :
10.1109/IJCNN.1999.833472