  • DocumentCode
    1536429
  • Title
    Selecting Attributes for Sentiment Classification Using Feature Relation Networks
  • Author
    Abbasi, Ahmed ; France, Stephen ; Zhang, Zhu ; Chen, Hsinchun
  • Author_Institution
    Sheldon B. Lubar Sch. of Bus., Univ. of Wisconsin - Milwaukee, Milwaukee, WI, USA
  • Volume
    23
  • Issue
    3
  • fYear
    2011
  • fDate
    3/1/2011
  • Firstpage
    447
  • Lastpage
    462
  • Abstract
    A major concern when incorporating large sets of diverse n-gram features for sentiment classification is the presence of noisy, irrelevant, and redundant attributes. These concerns can often make it difficult to harness the augmented discriminatory potential of extended feature sets. We propose a rule-based multivariate text feature selection method called Feature Relation Network (FRN) that considers semantic information and also leverages the syntactic relationships between n-gram features. FRN is intended to efficiently enable the inclusion of extended sets of heterogeneous n-gram features for enhanced sentiment classification. Experiments were conducted on three online review testbeds in comparison with methods used in prior sentiment classification research. FRN outperformed the comparison univariate, multivariate, and hybrid feature selection methods; it was able to select attributes resulting in significantly better classification accuracy irrespective of the feature subset sizes. Furthermore, by incorporating syntactic information about n-gram relations, FRN is able to select features in a more computationally efficient manner than many multivariate and hybrid techniques.
  • Keywords
    Internet; pattern classification; FRN; attributes selection; feature relation networks; n-gram features; semantic information; sentiment classification; natural language processing; affective computing; machine learning; subspace selection; text mining
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2010.110
  • Filename
    5510238