• DocumentCode
    3409307
  • Title
    A new approach to clustering biological data using message passing
  • Author
    Geng, Huimin ; Bastola, Dhundy ; Ali, Hesham
  • Author_Institution
    Dept. of Comput. Sci., Nebraska Univ., Omaha, NE, USA
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    493
  • Lastpage
    494
  • Abstract
    Clustering algorithms are widely used in bioinformatics to classify data, for example in gene expression analysis and in building phylogenetic trees. Biological data often describe parallel and spontaneous processes. To capture these features, we propose a new clustering algorithm that employs the concept of message passing. Message passing clustering (MPC) allows data objects to communicate with each other and produces clusters in parallel, thereby making the clustering process intrinsic. We have proved that MPC shares similarity with hierarchical clustering (HC) but offers significantly improved performance because it takes into account both local and global structure. We analyzed 35 sets of simulated dynamic gene expression data and achieved a hit rate of about 95%: 639 of 674 genes were correctly clustered. We also applied MPC to a real data set to build a phylogenetic tree from aligned Mycobacterium sequences. The results show higher classification accuracy than traditional clustering methods such as HC.
  • Keywords
    biology computing; genetics; message passing; microorganisms; parallel processing; pattern clustering; aligned mycobacterium sequences; bioinformatics; biological data clustering; data classification; hierarchical clustering; parallel processes; phylogenetic trees; simulated dynamic gene expression data; Algorithm design and analysis; Bioinformatics; Clustering algorithms; Clustering methods; Computer science; Couplings; Gene expression; Message passing; Pathology; Phylogeny
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332472
  • Filename
    1332472