DocumentCode
3601644
Title
On Efficient Feature Ranking Methods for High-Throughput Data Analysis
Author
Bo Liao ; Yan Jiang ; Wei Liang ; Lihong Peng ; Li Peng ; Damien Hanyurwimfura ; Zejun Li ; Min Chen
Author_Institution
Key Lab. for Embedded & Network Comput. of Hunan Province, Hunan Univ., Changsha, China
Volume
12
Issue
6
fYear
2015
Firstpage
1374
Lastpage
1384
Abstract
Efficient mining of high-throughput data has become a popular theme in the big data era. Existing biology-related feature ranking methods mainly focus on statistical and annotation information. In this study, two efficient feature ranking methods are presented. Multi-target regression and graph embedding are incorporated in an optimization framework, and feature ranking is achieved by introducing a structured sparsity norm. Unlike existing methods, the presented methods have two advantages: (1) The feature subset simultaneously accounts for global margin information and locality manifold information, so both global and local information are considered. (2) Features are selected in batches rather than individually in the algorithm framework, so the interactions between features are considered and the optimal feature subset can be guaranteed. In addition, this study presents a theoretical justification. Empirical experiments on a set of real-world gene expression data sets demonstrate the effectiveness and efficiency of the two algorithms in comparison with some state-of-the-art feature ranking methods.
Keywords
bioinformatics; cellular biophysics; data mining; genetics; graph theory; optimisation; regression analysis; annotation information; feature ranking methods; gene expression data sets; graph embedding; high-throughput data analysis; high-throughput data mining; locality manifold information; multi-target regression; optimization; statistical information; structured sparsity norm; Bioinformatics; Computational biology; Data mining; Information analysis; Regression analysis; ℓ2,1-norm; Feature ranking; Regression; convex optimization; manifold learning; microarray data analysis
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2015.2415790
Filename
7065240
Link To Document