• DocumentCode
    1933306
  • Title
    Clustering of High-Dimensional Gene Expression Data with Feature Filtering Methods and Diffusion Maps
  • Author
    Xu, Rui ; Damelin, Steven ; Nadler, Boaz ; Wunsch, Donald C.
  • Author_Institution
    Appl. Comput. Intell. Lab., Dept. of Electr. & Comput. Eng., Missouri Univ. of Sci. & Technol. Rolla, Rolla, MO
  • Volume
    1
  • fYear
    2008
  • fDate
    27-30 May 2008
  • Firstpage
    245
  • Lastpage
    249
  • Abstract
    The importance of gene expression data in cancer diagnosis and treatment has been widely recognized by cancer researchers in recent years. However, one of the major challenges in the computational analysis of such data is the curse of dimensionality, due to the overwhelming number of gene expression measurements relative to the small number of samples. Here, we use a two-step method to reduce the dimension of gene expression data. First, we extract a subset of genes based on the statistical characteristics of their corresponding gene expression measurements. For further dimensionality reduction, we then apply diffusion maps to the reduced data. Diffusion maps interpret the eigenfunctions of Markov matrices as a system of coordinates on the original data set in order to obtain an efficient representation of its geometric descriptions. A neural network clustering theory, Fuzzy ART, is applied to the resulting data to generate clusters of cancer samples. Experimental results on the small round blue-cell tumor (SRBCT) data set, compared with other widely used clustering algorithms, demonstrate the effectiveness of our proposed method in addressing multidimensional gene expression data.
  • Keywords
    Markov processes; cancer; eigenvalues and eigenfunctions; fuzzy systems; genetics; medical computing; patient diagnosis; pattern clustering; tumours; Fuzzy ART; Markov matrices; SRBCT; cancer diagnosis; computational analysis; diffusion maps; eigenfunctions; feature filtering methods; high-dimensional gene expression data clustering; multidimensional gene expression data; network clustering theory; small round blue-cell tumor data set; Cancer; Data analysis; Data mining; Eigenvalues and eigenfunctions; Filtering; Fuzzy neural networks; Gene expression; Neoplasms; Neural networks; Subspace constraints;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    BioMedical Engineering and Informatics, 2008. BMEI 2008. International Conference on
  • Conference_Location
    Sanya
  • Print_ISBN
    978-0-7695-3118-2
  • Type
    conf
  • DOI
    10.1109/BMEI.2008.256
  • Filename
    4548670