• DocumentCode
    2327211
  • Title

    Finding a high coverage set of δ-biclusters with swarm intelligence

  • Author

    de França, Fabrício O. ; Von Zuben, Fernando J.

  • Author_Institution
    Dept. of Comput. Eng. & Ind. Autom. (DCA), Univ. of Campinas (Unicamp), Campinas, Brazil
  • fYear
    2010
  • fDate
    18-23 July 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Biclustering usually refers to the process of finding subsets of rows and columns of a given dataset. Each such pair of subsets is a bicluster and corresponds to a sub-matrix whose elements tend to present a high degree of coherence with each other. To find such structures, the δ-biclustering problem was formulated: find a set of biclusters bounded by a coherence limit, measured by the mean-squared residue, while maximizing the total bicluster size. Additionally, a reduced overlap among the biclusters in the set is expected, that is, a minimization of the number of elements they share. This also leads to high coverage of the original dataset for a given number of biclusters. Most algorithms intended to find such biclusters focus only on the mean-squared residue and/or the bicluster size. This usually leads to a set of biclusters that does not fully cover the data and, as a consequence, has a high overlap among its members. This may generate redundant information on some portions of the dataset and a lack of information on others. Also, some methods introduce noise into the dataset to promote better coverage, but this sometimes misleads the search. In this paper, a swarm-based approach, named SwarmBcluster, is proposed to find biclusters effectively without introducing noise, with the main objective of achieving maximum coverage. Experiments were performed on two well-known datasets, and a comparative analysis against other approaches indicates that SwarmBcluster is capable of finding a set of biclusters with high coverage while maintaining a high average volume and obeying the imposed coherence constraint.
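The two quantities the abstract relies on, the mean-squared residue and the coverage, can be illustrated with a minimal sketch. The abstract does not state the formulas, so this assumes the mean-squared residue commonly used for δ-biclusters (each element minus its row mean, minus its column mean, plus the overall sub-matrix mean, squared and averaged) and takes coverage to be the fraction of matrix cells that fall in at least one bicluster. The function names, the toy matrix, and the threshold `delta` are hypothetical, and the code is not the SwarmBcluster algorithm itself.

```python
from itertools import product


def mean_squared_residue(data, rows, cols):
    """Mean-squared residue of the sub-matrix given by rows x cols (assumed definition)."""
    n_r, n_c = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(row_mean.values()) / n_r
    total = sum(
        (data[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i, j in product(rows, cols)
    )
    return total / (n_r * n_c)


def coverage(shape, biclusters):
    """Fraction of matrix cells that belong to at least one bicluster."""
    n_rows, n_cols = shape
    covered = set()
    for rows, cols in biclusters:
        covered.update(product(rows, cols))  # shared cells are counted once
    return len(covered) / (n_rows * n_cols)


# Toy data (hypothetical): rows 0-2 follow an additive pattern on columns 0-2.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [9.0, 1.0, 5.0, 2.0],
]
delta = 0.5  # hypothetical coherence threshold
coherent = ([0, 1, 2], [0, 1, 2])
other = ([2, 3], [2, 3])

print(mean_squared_residue(data, *coherent) <= delta)  # coherence constraint check
print(coverage((4, 4), [coherent, other]))             # overlapping cell (2, 2) counted once
```

In this sketch a perfectly additive sub-matrix has a residue of (numerically) zero, and two biclusters that share a cell cover fewer distinct cells than the sum of their sizes, which is the overlap-versus-coverage trade-off the abstract describes.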
  • Keywords
    data mining; matrix algebra; mean square error methods; minimisation; pattern clustering; δ-biclustering problem; SwarmBcluster; biclusters; coverage set; mean-squared residue; minimization; submatrix; swarm intelligence; Ant colony optimization; Coherence; Data mining; Indexes; Noise; Particle swarm optimization; Search problems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation (CEC), 2010 IEEE Congress on
  • Conference_Location
    Barcelona
  • Print_ISBN
    978-1-4244-6909-3
  • Type

    conf

  • DOI
    10.1109/CEC.2010.5586116
  • Filename
    5586116