DocumentCode
3603667
Title
Unsupervised Feature Selection with Controlled Redundancy (UFeSCoR)
Author
Banerjee, Monami ; Pal, Nikhil R.
Author_Institution
Electron. & Commun. Sci. Unit, Indian Stat. Inst., Kolkata, India
Volume
27
Issue
12
fYear
2015
Firstpage
3390
Lastpage
3403
Abstract
Features selected by a supervised/unsupervised technique often include redundant or correlated features. While use of correlated features may result in an increase in the design and decision making cost, removing redundancy completely can make the system vulnerable to measurement errors. Most feature selection schemes do not account for redundancy at all, while a few supervised methods try to discard correlated features. We propose a novel unsupervised feature selection scheme (UFeSCoR), which not only discards irrelevant features, but also selects features with controlled redundancy. Here, the number of selected features can also be directed. Our algorithm optimizes an objective function, which tries to select a specified number of features, with a controlled level of redundancy, such that the topology of the original data set can be maintained in the reduced dimension. Here, we have used Sammon's error as a measure of preservation of topology. We demonstrate the effectiveness of the algorithm in terms of choosing relevant features, controlling redundancy, and selecting a given number of features using several data sets. We make a comparative study with five unsupervised feature selection methods. Our results reveal that the proposed method can select useful features with controlled redundancy.
Keywords
feature selection; redundancy; unsupervised learning; Sammon error; UFeSCoR scheme; data set topology; decision making cost; measurement errors; objective function; supervised feature selection technique; unsupervised feature selection with controlled redundancy; Feature extraction; Linear programming; Logic gates; Measurement uncertainty; Network topology; Dimensionality reduction; Redundancy control; Sammon's error; Unsupervised feature selection; gradient descent technique
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2015.2455509
Filename
7155531