• DocumentCode
    3230262
  • Title

    Missing value estimation for microarray data based on fuzzy C-means clustering

  • Author

    Luo, Jiawei ; Yang, Tao ; Wang, Yan

  • Author_Institution
    Sch. of Comput. & Commun., Hunan Univ., Changsha
  • fYear
    2005
  • fDate
    1-1 July 2005
  • Lastpage
    616
  • Abstract
    Microarray experiments can generate data sets with multiple missing expression values, normally due to various experimental problems. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. Effective missing value estimation methods are therefore needed to minimize the effect of incomplete data sets on analysis and to increase the range of data sets to which these algorithms can be applied. In this paper, a new imputation method (FCMimpute) based on the fuzzy C-means clustering algorithm is proposed to estimate missing values in microarray data by using information in the cluster structures. It imputes each missing value from the corresponding attribute of all cluster centers, which are obtained through a fuzzy C-means clustering algorithm applicable to incomplete data. We compare the estimation accuracy of our method with the widely used KNNimpute and with the SKNNimpute method on various microarray data sets with different percentages of missing entries. In our experiments, the proposed FCMimpute method performs better than the other methods in terms of root mean square error.
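The abstract describes the approach only in outline, so the following is a minimal Python sketch of the general idea, not the authors' implementation. It assumes a partial-distance variant of fuzzy C-means (distances and centers computed over observed entries only) and fills each missing entry with a membership-weighted average of the cluster centers. The function names `fcm_impute` and `rmse`, the weighting by memberships raised to the fuzzifier `m`, and all default parameter values are illustrative assumptions that the record does not specify.

```python
import numpy as np


def fcm_impute(X, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fill NaNs in X (rows = genes, columns = samples).

    Hypothetical sketch: fuzzy C-means with partial distances,
    followed by a membership-weighted average of the cluster centers.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)               # True where a value is known
    X0 = np.where(observed, X, 0.0)       # placeholder zeros, always masked out
    n, p = X.shape

    rng = np.random.default_rng(seed)
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)     # each row of memberships sums to 1

    for _ in range(n_iter):
        W = U ** m
        # Cluster centers: weighted mean over observed entries only.
        V = (W.T @ X0) / np.maximum(W.T @ observed.astype(float), 1e-12)
        # Partial squared distances, rescaled by p / (number observed).
        diff = (X0[:, None, :] - V[None, :, :]) * observed[:, None, :]
        d2 = (diff ** 2).sum(axis=2)
        d2 *= (p / np.maximum(observed.sum(axis=1), 1))[:, None]
        d2 = np.maximum(d2, 1e-12)        # avoid division by zero
        # Standard fuzzy C-means membership update.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    # Assumed imputation rule: average the centers, weighted by u_ik^m.
    W = U ** m
    estimate = (W @ V) / W.sum(axis=1, keepdims=True)
    return np.where(observed, X, estimate)


def rmse(true, estimated, missing_mask):
    """Root mean square error over the entries that were missing."""
    diff = np.asarray(true)[missing_mask] - np.asarray(estimated)[missing_mask]
    return float(np.sqrt(np.mean(diff ** 2)))
```

A typical evaluation along the lines the abstract mentions would remove a known fraction of entries from a complete matrix, call `fcm_impute` on the result, and pass the original matrix, the filled matrix, and the mask of removed entries to `rmse`.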
  • Keywords
    data analysis; fuzzy systems; genetics; pattern clustering; FCMimpute; KNNimpute; SKNNimpute; data analysis; estimation accuracy; fuzzy C-means clustering; gene array values; gene expression analysis; imputation method; microarray data; missing value estimation; root mean square error; Algorithm design and analysis; Clustering algorithms; Data analysis; Filling; Fuzzy sets; Gene expression; Image resolution; Large-scale systems; Root mean square; Singular value decomposition; Microarray data; fuzzy C-means; missing value estimation; validity function
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computing in Asia-Pacific Region, 2005. Proceedings. Eighth International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7695-2486-9
  • Type

    conf

  • DOI
    10.1109/HPCASIA.2005.53
  • Filename
    1592330