• DocumentCode
    3756858
  • Title
    Class Discovery via Bimodal Feature Selection in Unsupervised Settings
  • Author
    Jessica Curtis;Mark Kon
  • Author_Institution
    Dept. of Math. &
  • fYear
    2015
  • Firstpage
    699
  • Lastpage
    702
  • Abstract
    In machine learning, there are numerous supervised techniques that extend naturally to analogous unsupervised methods, such as clustering. In this paper, we consider so-called rare-weak models, in which the number of important features is small (or rare) and the signal strength of each important feature is minimal (or weak). When classical clustering is applied crudely in "big data" scenarios, significant problems can arise, including long computational run times and significant clustering errors. One solution is to use feature selection (FS) to reduce dataset dimensionality before clustering. We introduce two novel unsupervised feature selection methods, one parametric and one nonparametric, based on what we call bimodal feature selection. These methods produce ranked lists of features based on their univariate multi-modality. Unlike previously developed univariate FS methods, which have typically been restricted to 2-cluster scenarios, our methods have been adapted and tested to discriminate both binary and higher-level clusterings. The methods are particularly advantageous in rare-weak settings, since reducing data dimensionality allows classical clustering methods to run faster and with greater accuracy.
  • Keywords
    "Clustering methods","Kernel","Clustering algorithms","Estimation","Standards","Electronic mail"
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICMLA.2015.206
  • Filename
    7424401