Title :
On Complexity of Effective Data Granulation in Databases
Author :
Wroblewski, Jakub ; Kowalski, Matthieu
Author_Institution :
Infobright Inc., Warsaw, Poland
Abstract :
We present a problem of splitting objects from finite set into heterogenous groups of equal or almost equal cardinalities. The problem resembles the classic problem of data clustering, but additional constraint on groups size and global measure of clustering quality makes well known clustering algorithms hardly applicable. Such problem occurs - as we believe - in many important aspects in computer science. We will consider its context in database area and make "body of research" the Info bright RDBMS as we noticed the problem emerges there in some natural way. In the paper we define the problem and present some specified applications. The main contribution of the article is the proof of its NP-hardness.
Keywords :
computational complexity; pattern clustering; relational databases; Infobright RDBMS; Infobright technology; NP-hard problem; clustering quality; data clustering; data granulation complexity; finite set; group size; object splitting; Engines; Loading; Query processing; Sorting; Vectors; Analytic Databases; Granulatng; NP-hard Problem; Outliers;
Conference_Titel :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Warsaw
DOI :
10.1109/WI-IAT.2014.119