DocumentCode :
3696632
Title :
A new feature selection based on class dependency and feature dissimilarity
Author :
Niphat Claypo;Saichon Jaiyen
Author_Institution :
Department of Computer Science, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Thailand
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
Feature selection is an important data preprocessing task in data mining. Data sets often contain a large number of features, which slows down the learning process when a classifier is trained and makes it unsuitable for big data analytics. This paper proposes a feature selection method based on class dependency and feature dissimilarity (CDFD) that uses mutual information and Euclidean distance. If the dataset contains discrete data, mutual information is used to determine the dependency between each feature and the class. If the dataset contains continuous data, the correlation between the feature and the class is used instead. The Euclidean distance is used to remove duplicated features based on the dissimilarity between features. Experiments are conducted on five datasets. The experimental results show that the proposed feature selection method can reduce both the number of features in the data set and the classification error of classifiers. Furthermore, it can be applied to both discrete and continuous data, and it helps classifiers improve their classification accuracy and reduce the computational time for learning.
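The abstract names the ingredients of CDFD but not how they are combined, so the following is only a rough sketch under stated assumptions and not the authors' algorithm. It scores each feature against the class (mutual information for discrete data, absolute Pearson correlation for continuous data) and then greedily drops any feature whose Euclidean distance to an already kept feature falls below a threshold. The function names, the greedy ordering, and the `min_distance` parameter are all hypothetical; the correlation branch also assumes numeric class labels.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x,y) * log(p(x,y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def euclidean_distance(a, b):
    """Euclidean distance between two feature columns of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_features(columns, labels, discrete, min_distance):
    """Return indices of kept features (hypothetical greedy scheme).

    columns: list of feature columns, each a list of numeric values.
    discrete: True to score with mutual information, False for correlation.
    min_distance: assumed threshold below which two features count as duplicates.
    """
    score = mutual_information if discrete else abs_correlation
    relevance = [score(col, labels) for col in columns]
    # Visit features from most to least class-dependent.
    order = sorted(range(len(columns)), key=lambda j: relevance[j], reverse=True)
    kept = []
    for j in order:
        # Keep a feature only if it is dissimilar to every feature kept so far.
        if all(euclidean_distance(columns[j], columns[i]) >= min_distance
               for i in kept):
            kept.append(j)
    return kept
```

With three discrete columns where the second duplicates the first, `select_features([[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]], [0, 0, 1, 1], True, 1.0)` keeps the first and third columns. In practice the feature columns would likely need to be scaled to a common range before distances are compared, and the threshold would have to be tuned per dataset.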
Keywords :
"Mutual information","Mathematical model","Training","Euclidean distance","Error analysis","Correlation","Genetic algorithms"
Publisher :
IEEE
Conference_Titel :
Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
Print_ISBN :
978-1-4673-8142-0
Type :
conf
DOI :
10.1109/ICAICTA.2015.7335366
Filename :
7335366