DocumentCode
28331
Title
Extremely High-Dimensional Feature Selection via Feature Generating Samplings
Author
Shutao Li ; Dan Wei
Author_Institution
Coll. of Electr. & Inf. Eng., Hunan Univ., Changsha, China
Volume
44
Issue
6
fYear
2014
fDate
Jun-14
Firstpage
737
Lastpage
747
Abstract
This paper proposes a sampling scheme that improves the efficiency of the recently developed feature generating machines (FGMs) when selecting informative features in extremely high-dimensional problems. In FGMs, ordering the features by their scores takes O(m log r) time, where m is the feature dimensionality and r is the size of the selected feature subset; this cost becomes prohibitive when m is very large, for example, m > 10^11. To address this problem, we propose a feature generating sampling method that reduces the complexity to O(G_s log(G) + G(G + log(G))) while preserving the most informative features in a feature buffer, where G_s is the maximum number of nonzero features per instance and G is the buffer size. Moreover, we show that the proposed sampling scheme can be regarded as a birth-death process in the sense of random process theory, which guarantees that most of the informative features are included for feature selection. Empirical studies on real-world datasets demonstrate the effectiveness of the proposed sampling method.
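The O(m log r) ordering cost mentioned in the abstract is what one would expect from a standard bounded-heap top-r selection. The sketch below illustrates only that generic technique, not the FGM implementation or the proposed sampling method; the function name `top_r_features` and its signature are hypothetical and chosen for illustration.

```python
import heapq
from typing import Iterable, List, Tuple


def top_r_features(scores: Iterable[float], r: int) -> List[Tuple[int, float]]:
    """Return the r highest-scoring (index, score) pairs, best first.

    Hypothetical helper: a min-heap of at most r entries is kept, so each
    of the m scores costs O(log r) at worst, i.e. O(m log r) overall.
    """
    if r <= 0:
        return []
    heap: List[Tuple[float, int]] = []  # (score, index); smallest score on top
    for idx, score in enumerate(scores):
        if len(heap) < r:
            heapq.heappush(heap, (score, idx))
        elif score > heap[0][0]:
            # The new score beats the weakest kept feature: swap it in.
            heapq.heapreplace(heap, (score, idx))
    ordered = sorted(heap, key=lambda t: t[0], reverse=True)
    return [(idx, score) for score, idx in ordered]


if __name__ == "__main__":
    print(top_r_features([0.1, 0.9, 0.3, 0.7, 0.5], 2))
```

Because every one of the m scores must be visited, the cost grows linearly in m even though only r features are kept, which is the bottleneck the abstract describes for very large m.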
Keywords
computational complexity; feature selection; random processes; sampling methods; birth-death process; computational cost; extremely high-dimensional feature selection; feature dimensionality; feature generating samplings; feature ordering; high-dimensional problems; informative feature selection; random processes theory; time complexity; Algorithm design and analysis; Analytical models; Complexity theory; Computational efficiency; Sampling methods; Training; Vectors; Extremely high dimensional problem; feature generating machine; informative feature
fLanguage
English
Journal_Title
Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
2168-2267
Type
jour
DOI
10.1109/TCYB.2013.2269765
Filename
6555840