DocumentCode :
3254636
Title :
L1 vs. L2 Regularization in Text Classification when Learning from Labeled Features
Author :
Mazilu, Sînziana ; Iria, José
Author_Institution :
IBM Res., Zurich Res. Lab., Ruschlikon, Switzerland
Volume :
1
fYear :
2011
fDate :
18-21 Dec. 2011
Firstpage :
166
Lastpage :
171
Abstract :
In this paper we study the problem of building document classifiers using labeled features and unlabeled documents, where not all the features are helpful for the process of learning. This is an important setting, since building classifiers using labeled words has recently been shown to require considerably less human labeling effort than building classifiers using labeled documents. We propose the use of Generalized Expectation (GE) criteria combined with an L1 regularization term for learning from labeled features. This lets the feature labels guide model expectation constraints, while approaching feature selection from a regularization perspective. We show that GE criteria combined with L1 regularization consistently outperform (by up to 12% in accuracy) the best previously reported results in the literature under the same setting, which were obtained using L2 regularization. Furthermore, the results obtained with GE criteria and an L1 regularizer are competitive with those obtained in the traditional instance-labeling setting at the same labeling cost.
Keywords :
learning (artificial intelligence); pattern classification; text analysis; GE; L1 vs. L2 regularization; document classifiers; generalized expectation; instance labeling setting; labeled features; learning process; text classification; unlabeled documents; Accuracy; Buildings; Feature extraction; Labeling; Logistics; Training; Training data; generalized expectation criteria; regularization; semi-supervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
Type :
conf
DOI :
10.1109/ICMLA.2011.85
Filename :
6146963