• DocumentCode
    3220158
  • Title

    Sample selection of microarray data using rough-fuzzy based approach

  • Author

    Paul, Amit ; Sil, Jaya

  • Author_Institution
    Comput. Sci. & Eng. Dept., Gurunanak Inst. of Technol., Sodpur, India
  • fYear
    2009
  • fDate
    9-11 Dec. 2009
  • Firstpage
    379
  • Lastpage
    384
  • Abstract
    Though DNA microarray technology simultaneously measures the expression levels of thousands of genes, only a few underlying gene features may account for significant data variation in gene classification problems. Selecting features from a huge data set is difficult, so dimension reduction of the gene expression data set is essential for determining the important features, which play a key role in predicting an outcome. Rough set theory (RST) has recently been used for dimension reduction of data; however, the existing methods are inadequate for finding a minimal reduct. The paper proposes an RST-based technique, applied to gene expression data, that achieves dimension reduction by obtaining a single reduct in one pass. The gene expression data are discretized using linguistic terms with proper semantics and represented by fuzzy sets. The discretized values are calculated using a Gaussian membership function with varied mean and standard deviation in order to eliminate the ambiguity between different linguistic terms. The genes are classified using linguistic decision attribute values based on the frequency of the gene expression data. Discretization and classification of the gene expression data are performed simultaneously, which significantly reduces the time complexity of the procedure. Thus, the proposed framework selects the most significant samples for gene classification, resulting in dimension reduction. Unlike other existing methods, the proposed method produces output that exhibits no variation with the experimental microarray gene information.
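    The fuzzy discretization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes expression values normalized to [0, 1] and three linguistic terms; the term names, means, and standard deviations below are hypothetical and are not taken from the paper, which does not state them in this record.

```python
import math

# Hypothetical linguistic terms as (mean, standard deviation) pairs.
# These values are illustrative assumptions, not the paper's parameters.
TERMS = {
    "low": (0.0, 0.2),
    "medium": (0.5, 0.2),
    "high": (1.0, 0.2),
}


def gaussian_membership(x, mean, sigma):
    """Degree of membership of x in a fuzzy set with a Gaussian shape."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def discretize(x, terms=TERMS):
    """Return the linguistic term whose membership degree for x is highest."""
    return max(terms, key=lambda t: gaussian_membership(x, *terms[t]))


def discretize_profile(values, terms=TERMS):
    """Discretize a list of normalized expression values term by term."""
    return [discretize(v, terms) for v in values]
```

    Assigning each value to the term with the highest membership is one simple way to resolve overlap between neighbouring terms; the paper may resolve it differently.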
  • Keywords
    DNA; Gaussian processes; biology computing; fuzzy set theory; pattern classification; rough set theory; DNA microarray technology; Gaussian membership function; fuzzy sets; gene expression data; linguistic decision attribute values; microarray data selection; rough set theory; Biomedical measurements; Clustering algorithms; Computer science; DNA; Data analysis; Data engineering; Gene expression; Genetics; Proteins; Set theory;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Nature & Biologically Inspired Computing, 2009. NaBIC 2009. World Congress on
  • Conference_Location
    Coimbatore
  • Print_ISBN
    978-1-4244-5053-4
  • Type

    conf

  • DOI
    10.1109/NABIC.2009.5393852
  • Filename
    5393852