• DocumentCode
    3601644
  • Title

    On Efficient Feature Ranking Methods for High-Throughput Data Analysis

  • Author

    Bo Liao ; Yan Jiang ; Wei Liang ; Lihong Peng ; Li Peng ; Damien Hanyurwimfura ; Zejun Li ; Min Chen

  • Author_Institution
    Key Lab. for Embedded & Network Comput. of Hunan Province, Hunan Univ., Changsha, China
  • Volume
    12
  • Issue
    6
  • fYear
    2015
  • Firstpage
    1374
  • Lastpage
    1384
  • Abstract
    Efficient mining of high-throughput data has become a popular theme in the big data era. Existing biology-related feature ranking methods mainly focus on statistical and annotation information. In this study, two efficient feature ranking methods are presented. Multi-target regression and graph embedding are incorporated in an optimization framework, and feature ranking is achieved by introducing a structured sparsity norm. Unlike existing methods, the presented methods have two advantages: (1) The feature subset simultaneously accounts for global margin information and locality manifold information, so both global and local structure are considered. (2) Features are selected in batches rather than individually in the algorithm framework, so the interactions between features are considered and the optimal feature subset can be guaranteed. In addition, this study presents a theoretical justification. Empirical experiments on a set of real-world gene expression data sets demonstrate the effectiveness and efficiency of the two algorithms in comparison with some state-of-the-art feature ranking methods.
  • Keywords
    bioinformatics; cellular biophysics; data mining; genetics; graph theory; optimisation; regression analysis; annotation information; feature ranking methods; gene expression data sets; graph embedding; high-throughput data analysis; high-throughput data mining; locality manifold information; multi-target regression; optimization; statistical information; structured sparsity norm; Bioinformatics; Computational biology; Data mining; Information analysis; Regression analysis; ℓ2,1-norm; Feature ranking; Regression; convex optimization; manifold learning; microarray data analysis
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2015.2415790
  • Filename
    7065240
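
The abstract describes combining multi-target regression, graph embedding, and a structured sparsity (ℓ2,1) norm to rank features in batch. The sketch below illustrates that general idea only; it is not the paper's two algorithms, whose exact formulation this record does not give. It assumes the objective ||XW - Y||_F^2 + alpha * tr(W^T X^T L X W) + beta * ||W||_{2,1}, a k-nearest-neighbour graph Laplacian L, and an iteratively reweighted least-squares solver. The function names, `alpha`, `beta`, `k`, and the solver choice are all illustrative assumptions.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-NN graph over the rows of X.

    Assumption: binary edge weights and Euclidean distance; the paper may
    build its graph differently.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    S = np.zeros((n, n))
    for i in range(n):
        # Drop the sample itself, then keep its k nearest neighbours.
        order = [j for j in np.argsort(sq[i]) if j != i]
        S[i, order[:k]] = 1.0
    S = np.maximum(S, S.T)  # make the adjacency matrix symmetric
    return np.diag(S.sum(axis=1)) - S


def rank_features_l21(X, Y, L=None, alpha=0.01, beta=1.0, n_iter=50, eps=1e-8):
    """Rank features by the row norms of an l2,1-regularized regression matrix.

    X: (n_samples, n_features) data, Y: (n_samples, n_targets) targets
    (for example one-hot class labels), L: optional (n_samples, n_samples)
    graph Laplacian. Returns (feature indices sorted best-first, scores).
    """
    n, d = X.shape
    if L is None:
        L = np.zeros((n, n))
    # Part of the normal equations that does not change between iterations.
    A = X.T @ X + alpha * (X.T @ L @ X)
    B = X.T @ Y
    D = np.eye(d)  # reweighting matrix standing in for the l2,1 term
    for _ in range(n_iter):
        # Solve (A + beta * D) W = X^T Y for the current reweighting.
        W = np.linalg.solve(A + beta * D, B)
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        # Rows with small norm get a large penalty and shrink toward zero.
        D = np.diag(1.0 / (2.0 * row_norms + eps))
    scores = np.sqrt((W ** 2).sum(axis=1))
    return np.argsort(-scores), scores
```

Because the ℓ2,1 penalty acts on whole rows of W, each feature is kept or suppressed jointly across all targets, which is one way to read the abstract's "selected by batch rather than individually". A call such as `rank_features_l21(X, Y, L=knn_laplacian(X))` returns the feature indices ordered by score.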