Title :
Feature selection via discretization
Author :
Liu, Huan ; Setiono, Rudy
Author_Institution :
Dept. of Inf. Syst. & Comput. Sci., Nat. Univ. of Singapore, Singapore
Abstract :
Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. In this way it achieves feature selection via discretization. It can handle mixed attributes, work with multiclass data, and remove irrelevant and redundant attributes.
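The merging step that the abstract alludes to can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only χ²-based merging of adjacent intervals for a single numeric attribute at a fixed threshold, and it omits the inconsistency check that the abstract names as the stopping rule. The function names `chi2_adjacent` and `merge_intervals`, the `threshold` parameter, and the substitution of 0.1 for a zero expected count are assumptions made for illustration.

```python
from collections import Counter


def chi2_adjacent(counts_a, counts_b, classes):
    """Chi-square statistic for two adjacent intervals given per-class counts."""
    rows = [counts_a, counts_b]
    row_totals = [sum(r.get(c, 0) for c in classes) for r in rows]
    col_totals = {c: sum(r.get(c, 0) for r in rows) for c in classes}
    n = sum(row_totals)
    stat = 0.0
    for r, r_total in zip(rows, row_totals):
        for c in classes:
            expected = r_total * col_totals[c] / n
            if expected == 0:
                expected = 0.1  # assumption: small constant avoids division by zero
            stat += (r.get(c, 0) - expected) ** 2 / expected
    return stat


def merge_intervals(values, labels, threshold):
    """Merge adjacent intervals of one attribute while the lowest chi-square is below threshold.

    Returns a list of (lower_bound, class_counts) pairs, one per remaining interval.
    """
    classes = sorted(set(labels))
    # Start with one interval per distinct attribute value.
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, Counter())[y] += 1
    intervals = [(v, by_value[v]) for v in sorted(by_value)]

    while len(intervals) > 1:
        stats = [
            chi2_adjacent(intervals[i][1], intervals[i + 1][1], classes)
            for i in range(len(intervals) - 1)
        ]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] >= threshold:
            break  # every adjacent pair differs significantly; stop merging
        lower, counts_a = intervals[i]
        _, counts_b = intervals[i + 1]
        intervals[i:i + 2] = [(lower, counts_a + counts_b)]
    return intervals


if __name__ == "__main__":
    # 2.706 is the chi-square critical value for 1 degree of freedom at the 0.10 level.
    result = merge_intervals([1, 2, 3, 4], ["a", "a", "b", "b"], threshold=2.706)
    print([lower for lower, _ in result])
```

In this sketch, an attribute whose values all collapse into a single interval carries no class information at the chosen threshold, which is one way to read how discretization can double as feature selection.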
Keywords :
data handling; feature extraction; learning (artificial intelligence); pattern classification; Chi2; chi2 statistic; discretization; feature selection; general algorithm; inconsistencies; mixed attributes; multiclass data; numeric attributes; redundant attribute removal; redundant attributes; Accuracy; Classification algorithms; Computer science; Information systems; Merging; Notice of Violation; Pattern classification; Remuneration; Statistics; Training data
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on