• DocumentCode
    3603667
  • Title

    Unsupervised Feature Selection with Controlled Redundancy (UFeSCoR)

  • Author

    Banerjee, Monami ; Pal, Nikhil R.

  • Author_Institution
    Electron. & Commun. Sci. Unit, Indian Stat. Inst., Kolkata, India
  • Volume
    27
  • Issue
    12
  • fYear
    2015
  • Firstpage
    3390
  • Lastpage
    3403
  • Abstract
    Features selected by a supervised/unsupervised technique often include redundant or correlated features. While use of correlated features may result in an increase in the design and decision making cost, removing redundancy completely can make the system vulnerable to measurement errors. Most feature selection schemes do not account for redundancy at all, while a few supervised methods try to discard correlated features. We propose a novel unsupervised feature selection scheme (UFeSCoR), which not only discards irrelevant features, but also selects features with controlled redundancy. Here, the number of selected features can also be directed. Our algorithm optimizes an objective function, which tries to select a specified number of features, with a controlled level of redundancy, such that the topology of the original data set can be maintained in the reduced dimension. Here, we have used Sammon's error as a measure of preservation of topology. We demonstrate the effectiveness of the algorithm in terms of choosing relevant features, controlling redundancy, and selecting a given number of features using several data sets. We make a comparative study with five unsupervised feature selection methods. Our results reveal that the proposed method can select useful features with controlled redundancy.
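The abstract names Sammon's error as the measure of how well a feature subset preserves the topology of the original data. The sketch below illustrates only that quantity, using the standard definition (the sum over point pairs of the squared difference between original and reduced distances, divided by the original distance, normalized by the sum of original distances). It is not the UFeSCoR objective, which per the abstract also controls redundancy and the number of selected features. The function names `pairwise_distances` and `sammon_error`, the use of Euclidean distance, and the hard 0/1 choice of columns are assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X (n x n matrix)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_error(X, selected):
    """Sammon's error between the full data X and X restricted to `selected`.

    X        : (n_samples, n_features) array
    selected : list of column indices kept in the reduced representation
    Assumption: a hard feature subset is evaluated; no optimization is done.
    """
    n = X.shape[0]
    upper = np.triu_indices(n, k=1)            # each pair (i < j) once
    d_orig = pairwise_distances(X)[upper]
    d_red = pairwise_distances(X[:, selected])[upper]

    # Pairs that coincide in the original space would divide by zero; skip them.
    keep = d_orig > 0
    d_orig, d_red = d_orig[keep], d_red[keep]

    return float(((d_orig - d_red) ** 2 / d_orig).sum() / d_orig.sum())


# Toy data: the third column is constant, so it carries no distance information.
X = np.array([[0.0, 0.0, 1.0],
              [3.0, 4.0, 1.0],
              [6.0, 8.0, 1.0]])

err_drop_constant = sammon_error(X, [0, 1])    # distances unchanged
err_drop_useful = sammon_error(X, [0])         # distances shrink
```

In this toy case, dropping the constant column leaves every pairwise distance intact, while dropping an informative column distorts them, so the second error is strictly larger than the first.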
  • Keywords
    feature selection; redundancy; unsupervised learning; Sammon error; UFeSCoR scheme; data set topology; decision making cost; measurement errors; objective function; supervised feature selection technique; unsupervised feature selection with controlled redundancy; Feature extraction; Linear programming; Logic gates; Measurement uncertainty; Network topology; Redundancy; Unsupervised learning; Dimensionality reduction; Redundancy Control; Sammon's error; Unsupervised Feature Selection; Unsupervised feature selection; dimensionality reduction; gradient descent technique; redundancy control;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2015.2455509
  • Filename
    7155531