• DocumentCode
    1932047
  • Title

    Multi-labeled document classification using semi-supervised mixture model of Watson distributions on document manifold

  • Author

    Nguyen Kim Anh ; Ngo Van Linh ; Nguyen Khac Toi ; Nguyen The Tam

  • Author_Institution
    Sch. of Inf. & Commun. Technol., Hanoi Univ. of Sci. & Technol., Hanoi, Vietnam
  • fYear
    2013
  • fDate
    15-18 Dec. 2013
  • Firstpage
    123
  • Lastpage
    128
  • Abstract
    Classification of multilabel documents is essential to information retrieval and text mining. Most existing approaches to multilabel text classification ignore the relationship between class labels and input documents, and they rely entirely on labeled data. In practice, unlabeled data is readily available, whereas labeled data is expensive to generate and error-prone because it requires human annotation. In this paper, we propose a novel multilabel document classification approach based on a semi-supervised mixture model of Watson distributions on the document manifold. The approach explicitly considers the manifold structure of the document space in order to exploit both labeled and unlabeled data efficiently. It models all labels within a dataset simultaneously, which lends itself well to capturing the relationships between these labels. Experimental results show that the proposed method outperforms state-of-the-art methods for multilabel text classification.
  • Keywords
    mixture models; pattern classification; statistical distributions; text analysis; Watson distributions; document manifold; labeled data; multilabeled document classification; multilabeled text classification; semisupervised mixture model; unlabeled data; Approximation methods; Art; Data models; Education; Manifolds; Support vector machines; Vectors; Laplacian Regularization; Mixture Models; Probabilistic Graphical Models; Semi-supervised Learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing and Pattern Recognition (SoCPaR), 2013 International Conference of
  • Conference_Location
    Hanoi
  • Print_ISBN
    978-1-4799-3399-0
  • Type

    conf

  • DOI
    10.1109/SOCPAR.2013.7054113
  • Filename
    7054113