• DocumentCode
    1840977
  • Title

    On relations between genes and metagenes obtained via gradient-based matrix factorization

  • Author

    Huang, Tian-Hsiang ; Nikulin, Vladimir ; McLachlan, Geoffrey J.

  • Author_Institution
    Dept. of Math., Univ. of Queensland, Brisbane, QLD, Australia
  • fYear
    2010
  • fDate
    13-15 July 2010
  • Firstpage
    17
  • Lastpage
    22
  • Abstract
    The high dimensionality of microarray data (expression levels of thousands of genes measured in a much smaller number of samples) presents challenges that affect the applicability of the analytical results. In principle, it would be better to describe the data in terms of a small number of metagenes, derived by matrix factorization, which could reduce noise while still capturing the essential features of the data. Our system is a two-step procedure. First, using the gradient-based matrix factorization (GMF) proposed in our previous study, we reduce a given microarray to a few metagenes. Second, we demonstrate the sensitivity of the system using a linear support vector machine (SVM). We conducted experiments on three real datasets. The standard leave-one-out (LOO) scheme was employed to evaluate the quality of the system. The evaluation with LOO misclassification rates (LMR) demonstrates that the metagenes obtained by our method can capture the important biological features of the data. In addition, we considered links between our model and the Gene Ontology (GO), taking into account the pathway records of the Kyoto Encyclopedia of Genes and Genomes (KEGG) extracted from candidate gene sets selected according to the absolute value of their correlations with the metagenes. This knowledge may be particularly useful for improving the interpretability of the results presented to biologists.
  • Keywords
    biology computing; genetics; genomics; matrix decomposition; molecular biophysics; support vector machines; gene ontology; genes; genomes; gradient-based matrix factorization; matrix factorization; metagenes; microarray data; misclassification rates; Colon; Correlation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complex Medical Engineering (CME), 2010 IEEE/ICME International Conference on
  • Conference_Location
    Gold Coast, QLD
  • Print_ISBN
    978-1-4244-6841-6
  • Type

    conf

  • DOI
    10.1109/ICCME.2010.5558880
  • Filename
    5558880
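
The abstract mentions selecting candidate gene sets by the absolute value of their correlations with the metagenes. The following is a minimal sketch of that ranking step only, not the authors' implementation. It assumes Pearson correlation (the abstract says only "correlations"), and the gene names, expression values, metagene values, and the helper names `pearson` and `rank_genes_by_abs_correlation` are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_genes_by_abs_correlation(expression, metagene, top_k):
    """Return the top_k (gene, correlation) pairs ordered by |correlation|.

    expression: dict mapping a gene name to its values across samples.
    metagene:   values of one metagene across the same samples.
    """
    scored = [(gene, pearson(values, metagene))
              for gene, values in expression.items()]
    scored.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return scored[:top_k]


# Hypothetical toy data: four genes measured in four samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],   # rises with the metagene
    "gene_b": [4.0, 3.0, 2.0, 1.0],   # falls as the metagene rises
    "gene_c": [1.0, 3.0, 2.0, 2.0],   # weakly related
    "gene_d": [5.0, 5.0, 5.0, 5.0],   # constant
}
metagene = [0.1, 0.2, 0.3, 0.4]

top = rank_genes_by_abs_correlation(expression, metagene, top_k=2)
for gene, corr in top:
    print(gene, round(corr, 3))
```

Ranking by absolute value keeps strongly anti-correlated genes (such as `gene_b` in the toy data) alongside strongly correlated ones, which matches the selection criterion stated in the abstract.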