• DocumentCode
    2775532
  • Title

    A New Minimally Supervised Learning Method for Semantic Term Classification - Experimental Results on Classifying Ratable Aspects Discussed in Customer Reviews

  • Author

    Nguyen, Thao Pham Thanh ; Hayashi, Takahiro ; Onai, Rikio ; Nishioka, Yuhei ; Takenaka, Takamasa ; Mori, Marco

  • Author_Institution
    Univ. of Electro-Commun., Chofu, Japan
  • fYear
    2009
  • fDate
    6-6 Dec. 2009
  • Firstpage
    43
  • Lastpage
    50
  • Abstract
    We present Bautext, a new minimally supervised approach for automatically extracting ratable aspects from customer reviews and classifying them into previously defined categories. Bautext requires only a small number of seed words as supervised data and uses a bootstrapping mechanism to progressively collect new members for each category. The distinguishing feature of Bautext's classification mechanism is that it learns new category members and the category-specific terms for each category at the same time. Category-specific terms are terms that play an important role in properly extracting new category members. Furthermore, we propose an additional trash category to filter out non-purpose aspects, which led to a significant improvement in precision while limiting the trade-off of decreased recall. Experimental results on a Japanese hotel review dataset showed that Bautext outperforms the alternative techniques in precision and recall, and significantly in running time. In a further comparison with Adaboost (as the state-of-the-art machine learning technique for the semantic term classification task), we found that Adaboost requires about 50% training data to deliver performance similar to what Bautext achieves with fewer than ten selected seed words for each category.
  • Keywords
    learning (artificial intelligence); pattern classification; Adaboost; Bautext; Japanese hotel review dataset; bootstrapping mechanism; both automatic extension; customer reviews; machine learning technique; minimally supervised learning method; ratable aspects extraction; seed words; semantic term classification task; supervised data; trash category; Conferences; Data mining; Digital cameras; Displays; Filters; Frequency; Machine learning; Measurement standards; Supervised learning; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4244-5384-9
  • Electronic_ISBN
    978-0-7695-3902-7
  • Type

    conf

  • DOI
    10.1109/ICDMW.2009.58
  • Filename
    5360527