DocumentCode
3696648
Title
Automatic multilabel classification for Indonesian news articles
Author
Dyah Rahmawati;Masayu Leylia Khodra
Author_Institution
School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
fYear
2015
Firstpage
1
Lastpage
6
Abstract
Problem transformation and algorithm adaptation are the two main machine learning approaches to multilabel classification. This paper investigates both approaches for multilabel classification of Indonesian news articles. Because this task involves a large number of features, we also apply feature selection methods to reduce feature dimensionality. The paper focuses on four factors: feature weighting method, feature selection method, multilabel classification approach, and single-label classification algorithm. These factors are combined to determine the best-performing configuration. The experiments show that the best performer for multilabel classification of Indonesian news articles combines TF-IDF feature weighting, Symmetrical Uncertainty feature selection, Calibrated Label Ranking (a problem transformation approach), and the SVM algorithm. This combination achieves an F-measure of 85.13% in 10-fold cross-validation, but the F-measure decreases to 76.73% on the test set because of out-of-vocabulary (OOV) words.
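The abstract names Symmetrical Uncertainty (SU) as the best-performing feature selection method. As a rough illustration only, the sketch below scores terms with the standard definition SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)) and ranks them. It is not the authors' implementation: the toy terms, the single binary label `is_sport`, and the function names are hypothetical, and the abstract does not say how scores are combined across multiple labels.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant: treat as uninformative
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Hypothetical toy data: term presence per document and one binary label.
term_present = {
    "bola":  [1, 1, 1, 0, 0, 0],   # perfectly aligned with the label
    "harga": [0, 0, 1, 1, 1, 0],   # weakly related to the label
    "yang":  [1, 1, 1, 1, 1, 1],   # appears everywhere: carries no information
}
is_sport = [1, 1, 1, 0, 0, 0]

# Score every term against the label, then rank from most to least informative.
scores = {t: symmetrical_uncertainty(v, is_sport) for t, v in term_present.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a feature selection step of this kind, only the top-ranked terms would be kept before weighting and classification; the cutoff is a tuning choice that the abstract does not specify.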
Keywords
"Classification algorithms","Correlation","Support vector machines","Uncertainty","Testing","Art","Adaptation models"
Publisher
ieee
Conference_Titel
Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
Print_ISBN
978-1-4673-8142-0
Type
conf
DOI
10.1109/ICAICTA.2015.7335382
Filename
7335382