DocumentCode :
2327211
Title :
Finding a high coverage set of δ-biclusters with swarm intelligence
Author :
de França, Fabrício O. ; Von Zuben, Fernando J.
Author_Institution :
Dept. of Comput. Eng. & Ind. Autom. (DCA), Univ. of Campinas (Unicamp), Campinas, Brazil
fYear :
2010
fDate :
18-23 July 2010
Firstpage :
1
Lastpage :
8
Abstract :
Biclustering usually refers to the process of finding subsets of rows and columns of a given dataset. Each such subset is a bicluster and corresponds to a sub-matrix whose elements tend to present a high degree of coherence with each other. To find such structures, the δ-biclustering problem was formulated as the problem of finding a set of biclusters whose coherence, measured by a mean-squared residue, respects a given bound, while maximizing the total size of the biclusters. Additionally, a reduced overlap among the biclusters in the set is expected, in other words, a minimization of the number of elements shared by them. This also leads to a high coverage of the original dataset for a given number of biclusters. Most algorithms intended to find such biclusters focus only on the mean-squared residue and/or the bicluster size. This usually leads to a set of biclusters that does not cover the whole dataset and, as a consequence, exhibits a high overlap among its members. The result may be redundant information on some portions of the dataset and a lack of information on others. Also, some methods introduce noise into the dataset to promote better coverage, which sometimes misleads the search. In this paper, a swarm-based approach, named SwarmBcluster, is proposed to find biclusters effectively without introducing noise, with the main objective of achieving maximum coverage. Experiments were performed on two well-known datasets, and a comparative analysis against other approaches indicates that SwarmBcluster is capable of finding a set of biclusters with high coverage while maintaining a high average volume and obeying the imposed coherence constraint.
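The two quantities the abstract relies on, the mean-squared residue of a bicluster and the coverage of a set of biclusters, can be illustrated with a minimal sketch. The abstract does not give formulas, so the code below assumes the standard Cheng and Church definition of the mean-squared residue and takes coverage to be the fraction of matrix cells contained in at least one bicluster. The function names and the toy matrix are hypothetical and are not taken from SwarmBcluster.

```python
def mean_squared_residue(data, rows, cols):
    """Mean-squared residue of the sub-matrix data[rows][cols].

    Assumes the Cheng and Church definition: the average of
    (a_ij - rowmean_i - colmean_j + overallmean)^2 over the sub-matrix.
    """
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows
    residue = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue += (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
    return residue / (n_rows * n_cols)


def coverage(shape, biclusters):
    """Fraction of matrix cells that belong to at least one bicluster.

    `biclusters` is a list of (rows, cols) index pairs (an assumed layout).
    """
    n_rows, n_cols = shape
    covered = set()
    for rows, cols in biclusters:
        covered.update((i, j) for i in rows for j in cols)
    return len(covered) / (n_rows * n_cols)


# Toy data: rows 0 and 1 differ by a constant shift, so they are fully coherent.
data = [[1, 2, 3],
        [2, 3, 4],
        [5, 9, 0]]
biclusters = [([0, 1], [0, 1, 2]), ([1, 2], [0, 1])]

msr_coherent = mean_squared_residue(data, [0, 1], [0, 1, 2])
msr_full = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
cov = coverage((3, 3), biclusters)
```

Under these assumptions, a candidate bicluster would be accepted when its residue does not exceed the threshold δ, and the two toy biclusters overlap in two cells, which is the kind of redundancy the abstract aims to reduce.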
Keywords :
data mining; matrix algebra; mean square error methods; minimisation; pattern clustering; δ-biclustering problem; SwarmBcluster; biclusters; coverage set; mean-squared residue; minimization; submatrix; swarm intelligence; Ant colony optimization; Coherence; Data mining; Indexes; Noise; Particle swarm optimization; Search problems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Evolutionary Computation (CEC), 2010 IEEE Congress on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6909-3
Type :
conf
DOI :
10.1109/CEC.2010.5586116
Filename :
5586116