• DocumentCode
    124247
  • Title

    On Complexity of Effective Data Granulation in Databases

  • Author

    Wroblewski, Jakub ; Kowalski, Matthieu

  • Author_Institution
    Infobright Inc., Warsaw, Poland
  • Volume
    2
  • fYear
    2014
  • fDate
    11-14 Aug. 2014
  • Firstpage
    358
  • Lastpage
    363
  • Abstract
    We present a problem of splitting objects from finite set into heterogenous groups of equal or almost equal cardinalities. The problem resembles the classic problem of data clustering, but additional constraint on groups size and global measure of clustering quality makes well known clustering algorithms hardly applicable. Such problem occurs - as we believe - in many important aspects in computer science. We will consider its context in database area and make "body of research" the Info bright RDBMS as we noticed the problem emerges there in some natural way. In the paper we define the problem and present some specified applications. The main contribution of the article is the proof of its NP-hardness.
  • Keywords
    computational complexity; pattern clustering; relational databases; Infobright RDBMS; Infobright technology; NP-hard problem; clustering quality; data clustering; data granulation complexity; finite set; group size; object splitting; Engines; Loading; Query processing; Sorting; Vectors; Analytic Databases; Granulatng; NP-hard Problem; Outliers;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Warsaw
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2014.119
  • Filename
    6927646