• DocumentCode
    3696632
  • Title
    A new feature selection based on class dependency and feature dissimilarity
  • Author
    Niphat Claypo;Saichon Jaiyen
  • Author_Institution
    Department of Computer Science, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Thailand
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Feature selection is an important data preprocessing task in data mining. Datasets often contain a large number of features, which slows down the learning process of a classifier and makes it unsuitable for big data analytics. This paper proposes a feature selection method based on class dependency and feature dissimilarity (CDFD) using mutual information and Euclidean distance. If the dataset contains discrete data, mutual information is used to determine the dependency between each feature and the class. If the dataset contains continuous data, the correlation between the feature and the class is used instead. The Euclidean distance is used to remove redundant features based on the dissimilarity between features. Experiments are conducted on five datasets. The experimental results show that the proposed feature selection method can reduce the number of features in a dataset and reduce the classification error of classifiers. Furthermore, it can be applied to both discrete and continuous data, and it can help classifiers improve their classification accuracy and reduce the computational time for learning.
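    The two criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' CDFD algorithm: the abstract does not state how the relevance scores and distances are combined, so the greedy ranking, the `min_distance` threshold, the use of absolute Pearson correlation, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    y_vals, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Joint probability table estimated from co-occurrence counts.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, ny)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def relevance(feature, labels, discrete):
    """Class dependency: MI for discrete data, |Pearson r| otherwise.

    Using the absolute Pearson correlation is an assumption; the abstract
    only says "correlation". Labels must be numeric in the continuous case.
    """
    if discrete:
        return mutual_information(feature, labels)
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels, dtype=float)
    if feature.std() == 0 or labels.std() == 0:
        return 0.0  # correlation is undefined for a constant vector
    return abs(float(np.corrcoef(feature, labels)[0, 1]))


def select_features(X, y, discrete, min_distance):
    """Greedy selection (hypothetical procedure, not taken from the paper).

    Features are visited in order of decreasing relevance; a feature is kept
    only if its Euclidean distance to every kept feature exceeds min_distance.
    """
    X = np.asarray(X, dtype=float)
    scores = np.array(
        [relevance(X[:, j], y, discrete) for j in range(X.shape[1])]
    )
    order = np.argsort(-scores, kind="stable")
    kept = []
    for j in order:
        if all(np.linalg.norm(X[:, j] - X[:, k]) > min_distance for k in kept):
            kept.append(int(j))
    return kept, scores


# Toy data: column 1 duplicates column 0; column 2 is independent of y.
y_demo = np.array([0, 0, 1, 1, 0, 1, 0, 1])
X_demo = np.column_stack([y_demo, y_demo, [0, 1, 0, 1, 0, 0, 1, 1]])
kept_demo, scores_demo = select_features(
    X_demo, y_demo, discrete=True, min_distance=0.5
)
print(kept_demo, scores_demo)
```

    In this sketch the duplicated column is dropped because its distance to an already kept column is zero. In practice the features would likely need to be scaled to a common range before Euclidean distances between them are comparable; that step is omitted here.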
  • Keywords
    "Mutual information","Mathematical model","Training","Euclidean distance","Error analysis","Correlation","Genetic algorithms"
  • Publisher
    ieee
  • Conference_Titel
    Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
  • Print_ISBN
    978-1-4673-8142-0
  • Type
    conf
  • DOI
    10.1109/ICAICTA.2015.7335366
  • Filename
    7335366