• DocumentCode
    3696648
  • Title
    Automatic multilabel classification for Indonesian news articles
  • Author
    Dyah Rahmawati;Masayu Leylia Khodra
  • Author_Institution
    School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Problem transformation and algorithm adaptation are the two main machine learning approaches to the multilabel classification problem. This paper investigates both approaches for multilabel classification of Indonesian news articles. Because this classification involves a large number of features, we also apply feature selection methods to reduce the feature dimension. The paper focuses on four factors: feature weighting method, feature selection method, multilabel classification approach, and single-label classification algorithm. These factors are combined to determine the best combination. The experiments show that the best performer for multilabel classification of Indonesian news articles combines the TF-IDF feature weighting method, the Symmetrical Uncertainty feature selection method, Calibrated Label Ranking (which belongs to the problem transformation approach), and the SVM algorithm. This combination achieves an F-measure of 85.13% in 10-fold cross-validation, but the F-measure decreases to 76.73% in testing because of out-of-vocabulary (OOV) words.
  • Keywords
    "Classification algorithms","Correlation","Support vector machines","Uncertainty","Testing","Art","Adaptation models"
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
  • Print_ISBN
    978-1-4673-8142-0
  • Type
    conf
  • DOI
    10.1109/ICAICTA.2015.7335382
  • Filename
    7335382