• DocumentCode
    18102
  • Title
    CLOpinionMiner: Opinion Target Extraction in a Cross-Language Scenario
  • Author
    Xinjie Zhou ; Xiaojun Wan ; Jianguo Xiao
  • Author_Institution
    MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China
  • Volume
    23
  • Issue
    4
  • fYear
    2015
  • fDate
    April 2015
  • Firstpage
    619
  • Lastpage
    630
  • Abstract
    Opinion target extraction is a subtask of opinion mining that is useful in many applications. The problem has usually been solved by training a sequence labeler on manually labeled data. However, labeled training data is unevenly distributed across languages, and the lack of a labeled corpus in a given language limits research progress on opinion target extraction in that language. To address this problem, we propose a novel system called CLOpinionMiner, which leverages the rich labeled data in a source language for opinion target extraction in a different target language. In this study, we focus on English-to-Chinese cross-language opinion target extraction. Based on the English dataset, our method produces two Chinese training datasets with different features. Two labeling models for Chinese opinion target extraction are then trained using Conditional Random Fields (CRF). After that, we use a monolingual co-training algorithm to improve the performance of both models by leveraging the enormous amount of unlabeled Chinese review text on the web. Experimental results show the effectiveness of our proposed approach.
  • Keywords
    data mining; language translation; linguistics; natural language processing; random processes; text analysis; CLOpinionMiner; CRF; Chinese review texts; Chinese training datasets; English dataset; English-to-Chinese cross-language opinion target extraction; conditional random fields; cross-language scenario; labeled corpus; labeled data; labeled training datasets; labeling models; monolingual co-training algorithm; opinion mining; sequence labeler training; Cameras; Data mining; Data models; Feature extraction; Information retrieval; Labeling; Training; Cross-language information extraction; opinion mining; opinion target extraction
  • fLanguage
    English
  • Journal_Title
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2392381
  • Filename
    7009977