• DocumentCode
    3541055
  • Title
    A non-parametric Bayesian clustering for gene expression data
  • Author
    Wang, Liming; Wang, Xiaodong
  • Author_Institution
    Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
  • fYear
    2012
  • fDate
    5-8 Aug. 2012
  • Firstpage
    556
  • Lastpage
    559
  • Abstract
    Clustering is an important data processing tool for interpreting microarray data and for genomic network inference. In this paper, we propose a non-parametric Bayesian clustering algorithm based on hierarchical Dirichlet processes (HDP). By introducing a hierarchical structure into the model, the proposed algorithm captures the hierarchical features prevalent in biological data such as gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor. We conduct experiments on the yeast galactose datasets and yeast cell cycle datasets, comparing our clustering results to the standard results. The proposed algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. The experiments also show that it provides more information and produces fewer unnecessary cluster fragments than the clustering algorithm based on the Dirichlet mixture model.
  • Keywords
    Bayes methods; data structures; genomics; lab-on-a-chip; pattern clustering; sampling methods; Chinese restaurant metaphor-based Gibbs sampling algorithm; Dirichlet mixture model-based clustering algorithm; HDP; biological data; data processing tool; gene expression data; hierarchical Dirichlet processes; hierarchical data structure; hierarchical features; hierarchical structure; microarray data interpretation; nonparametric Bayesian clustering; yeast cell cycle datasets; yeast galactose datasets; Bayesian methods; Biological system modeling; Clustering algorithms; Indexes; Inference algorithms; Signal processing algorithms; Dirichlet processes; Hierarchical Dirichlet processes; clustering; microarray data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Statistical Signal Processing Workshop (SSP), 2012 IEEE
  • Conference_Location
    Ann Arbor, MI
  • ISSN
    pending
  • Print_ISBN
    978-1-4673-0182-4
  • Electronic_ISBN
    pending
  • Type
    conf
  • DOI
    10.1109/SSP.2012.6319758
  • Filename
    6319758