• DocumentCode
    189191
  • Title
    Label Construction for Multi-label Feature Selection
  • Author
    Spolaor, Newton ; Monard, Maria Carolina ; Tsoumakas, Grigorios ; Lee, Huei
  • Author_Institution
    Lab. of Comput. Intell., Univ. of Sao Paulo, Sao Carlos, Brazil
  • fYear
    2014
  • fDate
    18-22 Oct. 2014
  • Firstpage
    247
  • Lastpage
    252
  • Abstract
    Multi-label learning handles datasets in which each instance is associated with multiple labels, which are often correlated. Like other machine learning tasks, multi-label learning suffers from the curse of dimensionality, which can be mitigated by dimensionality reduction techniques such as feature selection. The standard approach to multi-label feature selection transforms the multi-label dataset into single-label datasets before applying traditional feature selection algorithms. However, this approach often ignores label dependence. This work proposes an alternative method, LCFS, which constructs new labels based on relations between the original labels and uses them to augment the label set of the original dataset. The augmented dataset is then submitted to the standard multi-label feature selection approach. Experiments using Information Gain as the feature evaluation measure were carried out on 10 multi-label benchmark datasets. For each dataset, the quality of the selected features was assessed by the quality of the classifiers built from them, comparing features selected by the standard approach on the original dataset with features selected on the datasets constructed by four LCFS settings. The results show that LCFS settings using simple strategies over pairs of labels yield better classifiers than the standard approach applied to the original dataset. Moreover, these results are achieved when only a small number of features is selected.
  • Keywords
    data handling; feature extraction; learning (artificial intelligence); LCFS settings; augmented dataset; dimensionality reduction tasks; information gain; label construction; label dependence; machine learning tasks; multilabel benchmark datasets; multilabel learning; single label datasets; standard multilabel feature selection algorithm; Accuracy; Educational institutions; Entropy; Laboratories; Loss measurement; Standards; Transforms; Binary Relevance; Information Gain; feature ranking; filter feature selection; systematic review;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems (BRACIS), 2014 Brazilian Conference on
  • Conference_Location
    Sao Paulo
  • Type
    conf
  • DOI
    10.1109/BRACIS.2014.52
  • Filename
    6984838
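
The pipeline summarized in the abstract (construct new labels from pairs of original labels, augment the label set, then rank features with Information Gain over the single-label problems) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the XOR combination of label pairs, the averaging of Information Gain across labels, and all function names are assumptions made for the example, and features are taken to be discrete.

```python
import math
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(feature, label):
    """IG = H(label) - H(label | feature) for two discrete columns."""
    n = len(label)
    groups = {}
    for f, y in zip(feature, label):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def construct_labels(Y, combine=lambda a, b: a ^ b):
    """Append one constructed label per pair of original labels.

    The default XOR relation is an assumption; the abstract only says that
    new labels are built from relations between the original labels.
    """
    pairs = list(combinations(range(len(Y[0])), 2))
    return [list(row) + [combine(row[i], row[j]) for i, j in pairs] for row in Y]


def rank_features(X, Y):
    """Score each feature by its average IG over all labels (assumed
    aggregation) and return (feature indices sorted best first, scores)."""
    n_features, n_labels = len(X[0]), len(Y[0])
    scores = []
    for f in range(n_features):
        column = [row[f] for row in X]
        gains = [information_gain(column, [row[l] for row in Y])
                 for l in range(n_labels)]
        scores.append(sum(gains) / n_labels)
    ranking = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return ranking, scores


# Toy data: 4 instances, 2 binary features, 2 binary labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

Y_aug = construct_labels(Y)               # original labels + constructed pair label
ranking, scores = rank_features(X, Y_aug)  # standard approach on the augmented data
```

Running `rank_features(X, Y)` on the original label matrix gives the baseline ranking, so the two rankings can be compared by keeping the top-ranked features from each and training the same classifier on both subsets.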