• DocumentCode
    2767580
  • Title
    Aspect Guided Text Categorization with Unobserved Labels
  • Author
    Roth, Dan ; Tu, Yuancheng
  • Author_Institution
    Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2009
  • fDate
    6-9 Dec. 2009
  • Firstpage
    962
  • Lastpage
    967
  • Abstract
    This paper proposes a novel multiclass classification method and exhibits its advantage in the domain of text categorization with a large label space and, most importantly, when some of the labels were not observed in the training data. The key insight is the introduction of intermediate aspect variables that encode properties of the labels. Aspect variables serve as a joint representation for observed and unobserved labels. This way the classification problem can be viewed as a structure learning problem with natural constraints on assignments to the aspect variables. We solve the problem as a constrained optimization problem over multiple learners and show significant improvement in classifying short sentences into a large label space of categories, including previously unobserved categories.
  • Keywords
    classification; learning (artificial intelligence); text analysis; aspect guided text categorization; aspect variable; constrained optimization; multiclass classification; short sentence; structure learning; unobserved label; Books; Computer science; Conference management; Distributed computing; Engineering management; Meetings; Portals; Publishing; Software engineering; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-5242-2
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2009.129
  • Filename
    5360039