• DocumentCode
    3307292
  • Title

    A new feature selection method based on clustering

  • Author

    Huawen Liu ; Yuchang Mo ; Jiyi Wang ; Jianmin Zhao

  • Author_Institution
    Dept. of Comput. Sci., Zhejiang Normal Univ., Jinhua, China
  • Volume
    2
  • fYear
    2011
  • fDate
    26-28 July 2011
  • Firstpage
    965
  • Lastpage
    969
  • Abstract
    Feature selection is an effective technique for reducing the high dimensionality of data, which is prevalent in many application domains such as text categorization and bioinformatics. It brings learning algorithms several advantages, including improved efficiency and less over-fitting. Many efforts have been made in this field, and various feature selection methods have been developed and shown to be very competitive. Unlike other selection methods, the method proposed in this paper selects important features by means of feature clustering. Its main characteristic is that it works like agglomerative data clustering: each feature is treated as a data point and clustered using between-cluster and within-cluster distances. As a result, the selected feature subset has minimal redundancy among its members and maximal relevance to the class labels. Performance evaluations on seven benchmark datasets show that the proposed method achieves better classification performance than other feature selection methods.
  • Keywords
    learning (artificial intelligence); pattern classification; pattern clustering; between-cluster distance; bioinformatics; data clustering; feature clustering; feature selection method; learning algorithms; text categorization; within-cluster distances; Accuracy; Clustering algorithms; Machine learning; Measurement; Mutual information; Pattern recognition; Redundancy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-61284-180-9
  • Type

    conf

  • DOI
    10.1109/FSKD.2011.6019687
  • Filename
    6019687
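
The abstract describes features being clustered agglomeratively, after which the selected subset has low redundancy and high relevance to the class labels. A minimal sketch of that general idea follows; it is not the authors' algorithm. The record does not give the paper's distance definitions or stopping rule, so everything below is an assumption made for illustration: the distance 1 - I(X;Y)/H(X,Y) between discrete features, average linkage, a fixed number of clusters k, and all function names.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), clamped at zero against rounding."""
    joint = entropy(list(zip(xs, ys)))
    return max(0.0, entropy(xs) + entropy(ys) - joint)


def feature_distance(xs, ys):
    """Assumed distance: 1 - I(X;Y)/H(X,Y); 0 for fully redundant features."""
    joint = entropy(list(zip(xs, ys)))
    if joint == 0:  # both features are constant
        return 0.0
    return 1.0 - mutual_information(xs, ys) / joint


def cluster_features(columns, k):
    """Merge feature clusters bottom-up (average linkage) until k remain."""
    n = len(columns)
    dist = {(i, j): feature_distance(columns[i], columns[j])
            for i, j in combinations(range(n), 2)}

    def linkage(a, b):
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Pick the closest pair of clusters; a < b by construction.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def select_features(columns, labels, k):
    """Keep, from each cluster, the feature most informative about labels."""
    clusters = cluster_features(columns, k)
    return sorted(max(c, key=lambda i: mutual_information(columns[i], labels))
                  for c in clusters)
```

Here `columns` is a list of feature columns (one sequence of discrete values per feature) and `labels` is the class-label sequence. Calling `select_features(columns, labels, k)` returns k feature indices, one per cluster, so mutually redundant features end up represented by a single member.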