• DocumentCode
    593483
  • Title
    Improved T-Cluster based scheme for combination gene scale expression data
  • Author
    Vengatesan, K. ; Selvarajan, S.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Muthayammal Eng. Coll., Namakkal, India
  • fYear
    2012
  • fDate
    21-22 Dec. 2012
  • Firstpage
    131
  • Lastpage
    136
  • Abstract
    Clustering is an unsupervised learning technique in that there is no explicit demarcation of data into training and test sets. Clustering aims to group related records by measuring similarities among their attributes. The major phase of any clustering technique is similarity measurement, which is based on different factors and parameters. The improved Nonnegative Matrix Factorization (NMF) based TCLUST (T-Clustering) algorithm is an Expectation Maximization (EM) principle based algorithm intended to search for approximate solutions. The EM algorithm is an efficient method of obtaining a solution to the mixture likelihood problem. Genes with a common function are often hypothesized to have correlated expression levels across different conditions. NMF clustering is introduced to find a small number of meta genes, each defined as a positive linear combination of the genes in the expression data. The proposed clustering algorithm is applied to a genome-scale gene expression dataset to perform enrichment analysis and to discover highly significant biological clusters.
  • Keywords
    biology computing; data structures; expectation-maximisation algorithm; genetics; genomics; matrix decomposition; pattern clustering; unsupervised learning; EM principle based algorithm; NMF clustering; T-clustering algorithm; biological clusters discovery; clustering techniques; combination gene scale expression data; enrichment analysis; expectation maximization based algorithm; genome scale gene expression dataset; improved T-cluster based scheme; meta genes; mixture likelihood problem; nonnegative matrix factorization based TCLUST algorithm; similarity measurement; test data; training data; unsupervised learning technique; Algorithm design and analysis; Clustering algorithms; Correlation coefficient; Euclidean distance; Gene expression; Vectors; EM Algorithms; Non Negative Matrix Factorization; TCLUST; Tanimoto Correlation Coefficient; Translation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Radar, Communication and Computing (ICRCC), 2012 International Conference on
  • Conference_Location
    Tiruvannamalai
  • Print_ISBN
    978-1-4673-2756-5
  • Type
    conf
  • DOI
    10.1109/ICRCC.2012.6450562
  • Filename
    6450562