• DocumentCode
    1987418
  • Title
    A method for tight clustering: with application to microarray
  • Author
    Tseng, George C.; Wong, Wing H.
  • Author_Institution
    Dept. of Biostat., Pittsburgh Univ., PA, USA
  • fYear
    2003
  • fDate
    11-14 Aug. 2003
  • Firstpage
    396
  • Lastpage
    397
  • Abstract
    In this paper we propose a method for clustering that produces tight and stable clusters without forcing all points into clusters. Many existing clustering algorithms have been applied to microarray data to search for gene clusters with similar expression patterns. However, none has provided a way to deal with an essential feature of array data: many genes are expressed sporadically and do not belong to any of the significant biological functions (clusters) of interest. In fact, most current algorithms aim to assign all genes to clusters. For many biological studies, however, we are mainly interested in the most informative, tight and stable clusters with sizes of, say, 20-60 genes for further investigation. Tight Clustering has been developed specifically to address this problem. The tightest and most stable clusters are identified in a sequential manner through an analysis of the tendency of genes to be grouped together under repeated resampling. We validated this method on the expression profiles of the Drosophila life cycle. The result is shown to better serve biological needs in microarray analysis.
  • Keywords
    arrays; biology computing; genetic algorithms; genetics; pattern clustering; unsupervised learning; zoology; Drosophila life cycle; biological functions; clustering algorithms; expression patterns; gene clusters; microarrays; repeated resampling; tight clustering; Biological processes; Clustering algorithms; Data analysis; Diseases; Information filtering; Information filters; Iterative algorithms; Statistics; Training data; Unsupervised learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
  • Print_ISBN
    0-7695-2000-6
  • Type
    conf
  • DOI
    10.1109/CSB.2003.1227343
  • Filename
    1227343