DocumentCode
3756858
Title
Class Discovery via Bimodal Feature Selection in Unsupervised Settings
Author
Jessica Curtis;Mark Kon
Author_Institution
Dept. of Math. &
fYear
2015
Firstpage
699
Lastpage
702
Abstract
In machine learning, numerous supervised techniques extend naturally to analogous unsupervised methods, such as clustering. In this paper, we consider so-called rare-weak models, in which the number of important features is small (rare) and the signal strength of each important feature is minimal (weak). When classical clustering is applied naively in "big data" scenarios, serious problems can arise, including long computational run times and substantial clustering errors. One solution is to use feature selection (FS) to reduce dataset dimensionality before clustering. We introduce two novel unsupervised feature selection methods, one parametric and one nonparametric, based on what we call bimodal feature selection. These methods produce ranked lists of features based on their univariate multi-modality. Unlike previously developed univariate FS methods, which have typically been restricted to 2-cluster scenarios, our methods have been adapted and tested to discriminate both binary and higher-level clusterings. The methods are particularly advantageous in rare-weak settings, since reducing data dimensionality allows classical clustering methods to run faster and with greater accuracy.
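The abstract does not specify how univariate multi-modality is scored, so the following is only a minimal sketch of the general idea of ranking features by a per-feature bimodality score. It substitutes Sarle's bimodality coefficient, (skewness^2 + 1) / kurtosis, for the paper's parametric and nonparametric criteria; this is a swapped-in technique, not the authors' method. The function names (`bimodality_coefficient`, `rank_features`, `make_toy_data`), the toy data, and the strong mean shift of 4.0 are all assumptions made for illustration, and the shift is far stronger than a genuinely weak signal would be.

```python
import random
from statistics import fmean


def bimodality_coefficient(values):
    """Sarle's bimodality coefficient, population form: (skew^2 + 1) / kurtosis.

    Stand-in score only; NOT the criterion used in the paper.
    A Gaussian gives about 1/3; well-separated two-mode data scores higher.
    """
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)
    if m2 == 0.0:
        return 0.0  # a constant feature carries no modality information
    m3 = fmean((v - mean) ** 3 for v in values)
    m4 = fmean((v - mean) ** 4 for v in values)
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # non-excess kurtosis (3 for a Gaussian)
    return (skew ** 2 + 1.0) / kurt


def rank_features(rows):
    """Score each column independently and sort indices, most bimodal first."""
    n_features = len(rows[0])
    scores = [
        bimodality_coefficient([row[j] for row in rows])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


def make_toy_data(n_samples=400, n_features=20, informative=(3, 11),
                  shift=4.0, seed=0):
    """Hypothetical toy data: two hidden classes that differ only in the
    `informative` columns; every other column is pure Gaussian noise."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_samples):
        label = i % 2  # hidden class, never shown to the ranking step
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += shift * label
        rows.append(row)
    return rows


rows = make_toy_data()
order, scores = rank_features(rows)
print("highest-ranked features:", sorted(order[:2]))
```

In a pipeline of the kind the abstract describes, only the top-ranked columns would then be passed to a classical clustering method; how many features to keep is a separate choice that this sketch does not address.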
Keywords
"Clustering methods","Kernel","Clustering algorithms","Estimation","Standards","Electronic mail"
Publisher
IEEE
Conference_Titel
Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
Type
conf
DOI
10.1109/ICMLA.2015.206
Filename
7424401