DocumentCode :
730513
Title :
Learning interpretable classification rules using sequential rowsampling
Author :
Dash, Sanjeeb ; Malioutov, Dmitry M. ; Varshney, Kush R.
Author_Institution :
IBM Res., Yorktown Heights, NY, USA
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
3337
Lastpage :
3341
Abstract :
In our previous work we have presented an approach to learn interpretable classification rules using a Boolean compressed sensing formulation. Our approach uses a linear programming (LP) relaxation and allows us to find interpretable (sparse) classification rules that achieve good generalization accuracy. However, the resulting LP representation for problems with either a large number of samples or large number of continuous features tends to become challenging for off-the-shelf LP solvers. We have explored a screening approach which allows us to dramatically reduce the number of active features without sacrificing optimality. In this work we explore reducing the number of samples in a sequential setting where we can certify reaching a near-optimal solution while only solving the LP on a small fraction of the available data points. In a batch setting this approach can dramatically reduce the computational complexity of the rule-learning LP formulation. In an online setting we derive stochastic upper and lower bounds on the the LP objective for unseen samples. This allows early stopping when we detect that the classifier will not change significantly with additional samples. The upper bounds are related to the learning curve literature in machine learning, and our lower bounds appear not to have been explored. Finally, we discuss a quick approach to compute the complete regularization path balancing rule interpretability versus accuracy.
Keywords :
Boolean functions; compressed sensing; computational complexity; learning (artificial intelligence); linear programming; signal classification; signal sampling; stochastic processes; Boolean compressed sensing formulation; LP relaxation; LP representation; LP solvers; computational complexity; learning curve literature; learning interpretable classification rules; linear programming; machine learning; regularization path balancing rule; rule-learning LP formulation; sequential row sampling; sequential setting; sparse classification rules; stochastic lower bounds; stochastic upper bounds; Compressed sensing; Computational modeling; IP networks; Sensors; Stochastic processes; Training; Upper bound; Linear programming duality; row sampling; rule learning; sparse signal approximation; supervised classification;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178589
Filename :
7178589
Link To Document :
بازگشت