DocumentCode
85217
Title
Fuzzy–Rough Simultaneous Attribute Selection and Feature Extraction Algorithm
Author
Maji, Pradipta ; Garai, Partha
Author_Institution
Machine Intell. Unit, Indian Stat. Inst., Kolkata, India
Volume
43
Issue
4
fYear
2013
fDate
Aug. 2013
Firstpage
1166
Lastpage
1177
Abstract
Of the many attributes or features present in real-life data sets, only a small fraction is effective in representing the data accurately. Selecting or extracting relevant and significant features before analysis is therefore an important preprocessing step in pattern recognition, data mining, and machine learning. This paper presents a novel dimensionality reduction method, based on fuzzy-rough sets, that simultaneously selects attributes and extracts features using the concept of feature significance. The method maximizes both the relevance and the significance of the reduced feature set, thereby removing redundancy from it. The paper also uses classical and neighborhood rough sets to compute the relevance and significance of the feature set and compares their performance with that of fuzzy-rough sets in terms of the predictive accuracy of the nearest neighbor rule, support vector machine, and decision tree. An important finding is that the proposed method based on fuzzy-rough sets is more effective at generating a relevant and significant feature subset. The effectiveness of the proposed method, along with a comparison with existing attribute selection and feature extraction methods, is demonstrated on real-life data sets.
Keywords
data analysis; data mining; data structures; decision trees; feature extraction; fuzzy set theory; learning (artificial intelligence); rough set theory; support vector machines; classical rough sets; data mining; data set analysis; data set representation; decision tree; dimensionality reduction method; feature extraction algorithm; feature selection; feature significance concept; feature subset; fuzzy-rough sets; fuzzy-rough simultaneous attribute selection; machine learning; nearest neighbor rule; neighborhood rough sets; pattern recognition; support vector machine; Approximation methods; Complexity theory; Data mining; Feature extraction; Rough sets; Silicon; Uncertainty; Attribute selection; classification; feature extraction; pattern recognition; rough sets;
fLanguage
English
Journal_Title
Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
2168-2267
Type
jour
DOI
10.1109/TSMCB.2012.2225832
Filename
6374694