Title of article :
Clustering microarray data using model-based double K-means
Author/Authors :
Francesca Martella & Maurizio Vichi, authors
Issue Information :
Journal with serial issue number, year 2012
Abstract :
The microarray technology allows the measurement of expression levels of thousands of genes
simultaneously. The dimension and complexity of gene expression data obtained by microarrays create
challenging data analysis and management problems ranging from the analysis of images produced by
microarray experiments to biological interpretation of results. Therefore, statistical and computational
approaches are beginning to assume a substantial position within the molecular biology area. We consider
the problem of simultaneously clustering genes and tissue samples (in general conditions) of a microarray
data set. This can be useful for revealing groups of genes involved in the same molecular process as well as
groups of conditions where this process takes place. The need to find a subset of genes and tissue samples
defining a homogeneous block has led to the application of double clustering techniques to gene expression
data. Here, we focus on an extension of standard K-means to simultaneously cluster observations and
features of a data matrix, namely double K-means, introduced by Vichi (2000). We introduce this model in
a probabilistic framework and discuss the advantages of using this approach. We also develop a coordinate
ascent algorithm and test its performance via simulation studies and a real data set. Finally, we validate the
results obtained on the real data set by building resampling confidence intervals for block centroids.
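As an illustration of the idea of clustering rows and columns simultaneously, the following is a minimal sketch of a plain least-squares double K-means, not the model-based (probabilistic) formulation or the coordinate ascent algorithm developed in the article. It alternates between recomputing block centroids and reassigning rows and columns to the clusters that minimise the squared error. The function names, the random initialisation, and the handling of empty blocks are assumptions made for this example.

```python
import numpy as np


def block_means(X, rows, cols, n_row_clusters, n_col_clusters, previous=None):
    """Mean of each (row cluster, column cluster) block of X.

    Empty blocks keep their previous centroid (or 0 if none is given);
    this is an assumption of the sketch, not a rule from the article.
    """
    if previous is None:
        centroids = np.zeros((n_row_clusters, n_col_clusters))
    else:
        centroids = previous.copy()
    for k in range(n_row_clusters):
        for q in range(n_col_clusters):
            block = X[np.ix_(rows == k, cols == q)]
            if block.size > 0:
                centroids[k, q] = block.mean()
    return centroids


def double_kmeans(X, n_row_clusters, n_col_clusters, n_iter=50, seed=0):
    """Alternating least-squares double K-means (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random initial partitions of rows (e.g. genes) and columns (e.g. samples).
    rows = rng.integers(n_row_clusters, size=n)
    cols = rng.integers(n_col_clusters, size=p)
    centroids = None
    for _ in range(n_iter):
        centroids = block_means(X, rows, cols, n_row_clusters,
                                n_col_clusters, centroids)
        # Row step: cost[i, k] = sum_j (x_ij - centroid[k, cols[j]])^2.
        row_cost = ((X[:, None, :] - centroids[:, cols][None, :, :]) ** 2).sum(axis=2)
        new_rows = row_cost.argmin(axis=1)
        # Column step: cost[j, q] = sum_i (x_ij - centroid[new_rows[i], q])^2.
        col_cost = ((X.T[:, None, :] - centroids[new_rows, :].T[None, :, :]) ** 2).sum(axis=2)
        new_cols = col_cost.argmin(axis=1)
        converged = np.array_equal(new_rows, rows) and np.array_equal(new_cols, cols)
        rows, cols = new_rows, new_cols
        if converged:
            break
    # Recompute centroids so they match the returned partitions.
    centroids = block_means(X, rows, cols, n_row_clusters,
                            n_col_clusters, centroids)
    return rows, cols, centroids
```

Calling `double_kmeans(X, 3, 2)` on a genes-by-samples matrix `X` returns a cluster label for every row, a cluster label for every column, and a 3 x 2 matrix of block centroids. Like ordinary K-means, this procedure can stop at a local optimum, so in practice it would typically be run from several random starts.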
Keywords :
double K-means, coordinate ascent algorithm, microarray data, model-based biclustering, stratified resampling procedure
Journal title :
JOURNAL OF APPLIED STATISTICS