• DocumentCode
    1051533
  • Title

    A New Feature Selection Scheme Using a Data Distribution Factor for Unsupervised Nominal Data

  • Author

    Chow, Tommy W S ; Wang, Piyang ; Ma, Eden W M

  • Author_Institution
    City Univ. of Hong Kong, Kowloon
  • Volume
    38
  • Issue
    2
  • fYear
    2008
  • fDate
    4/1/2008 12:00:00 AM
  • Firstpage
    499
  • Lastpage
    509
  • Abstract
    A new, efficient unsupervised feature selection method is proposed to handle nominal data without data transformation. The method introduces a new data distribution factor to select appropriate clusters, and it combines compactness and separation with a newly introduced concept of a singleton item. The method considers all features globally. It is computationally inexpensive and delivers promising results. Eight datasets from the University of California Irvine (UCI) machine learning repository and a high-dimensional cDNA dataset are used in this paper. The results obtained show that the proposed method is efficient and reliable.
  • Keywords
    feature extraction; pattern clustering; unsupervised learning; clustering algorithm; data distribution factor; unsupervised feature selection method; unsupervised nominal data; Clustering; feature ranking; unsupervised feature selection; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Decision Support Techniques; Information Storage and Retrieval; Models, Statistical; Pattern Recognition, Automated; Statistical Distributions
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type

    jour

  • DOI
    10.1109/TSMCB.2007.914707
  • Filename
    4443855