DocumentCode :
3603667
Title :
Unsupervised Feature Selection with Controlled Redundancy (UFeSCoR)
Author :
Banerjee, Monami ; Pal, Nikhil R.
Author_Institution :
Electron. & Commun. Sci. Unit, Indian Stat. Inst., Kolkata, India
Volume :
27
Issue :
12
fYear :
2015
Firstpage :
3390
Lastpage :
3403
Abstract :
Features selected by a supervised or unsupervised technique often include redundant or correlated features. While the use of correlated features may increase the design and decision-making cost, removing redundancy completely can make the system vulnerable to measurement errors. Most feature selection schemes do not account for redundancy at all, while a few supervised methods try to discard correlated features. We propose a novel unsupervised feature selection scheme (UFeSCoR), which not only discards irrelevant features, but also selects features with controlled redundancy. The number of selected features can also be specified. Our algorithm optimizes an objective function that tries to select a specified number of features, with a controlled level of redundancy, such that the topology of the original data set is maintained in the reduced dimension. We use Sammon's error as the measure of topology preservation. Using several data sets, we demonstrate the effectiveness of the algorithm in choosing relevant features, controlling redundancy, and selecting a given number of features. We also make a comparative study with five unsupervised feature selection methods. Our results reveal that the proposed method can select useful features with controlled redundancy.
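The abstract names Sammon's error as the measure of how well a feature subset preserves the topology of the original data. As a minimal sketch of that quantity only, the snippet below computes the standard Sammon's error between pairwise Euclidean distances in the full feature space and in a selected subset. The function name `sammon_error` and its arguments are illustrative assumptions, and the record does not give the redundancy control, the feature-count control, or the optimizer of UFeSCoR, so none of those are reproduced here.

```python
import math
from itertools import combinations


def sammon_error(rows, selected):
    """Standard Sammon's error of a feature subset (hypothetical helper).

    rows     -- list of equal-length numeric sequences, one per data point
    selected -- indices of the features kept in the reduced space
    """
    total_orig = 0.0   # sum of original-space distances (normalizer)
    weighted_sq = 0.0  # sum of squared distance mismatches, each / d_orig
    for a, b in combinations(rows, 2):
        d_orig = math.dist(a, b)
        if d_orig == 0.0:
            continue  # coincident points would divide by zero; skip them
        d_red = math.dist([a[k] for k in selected],
                          [b[k] for k in selected])
        total_orig += d_orig
        weighted_sq += (d_orig - d_red) ** 2 / d_orig
    return weighted_sq / total_orig if total_orig else 0.0


# Two points at distance 5 in the full space and 3 on feature 0 alone.
points = [(0.0, 0.0), (3.0, 4.0)]
print(sammon_error(points, [0]))     # error from dropping feature 1
print(sammon_error(points, [0, 1]))  # keeping every feature gives 0.0
```

A lower value means the pairwise distances, and hence the structure of the data, are better preserved by the selected features.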
Keywords :
feature selection; redundancy; unsupervised learning; Sammon error; UFeSCoR scheme; data set topology; decision making cost; measurement errors; objective function; supervised feature selection technique; unsupervised feature selection with controlled redundancy; Feature extraction; Linear programming; Logic gates; Measurement uncertainty; Network topology; Redundancy; Unsupervised learning; Dimensionality reduction; Redundancy Control; Sammon's error; Unsupervised Feature Selection; Unsupervised feature selection; dimensionality reduction; gradient descent technique; redundancy control;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2015.2455509
Filename :
7155531