Title :
Using Chinese part-of-speech patterns for sentiment phrase identification and opinion extraction in user generated reviews
Author :
Peng, Ting-Chun ; Shih, Chia-Chun
Author_Institution :
Inst. for Inf. Ind., Taipei, Taiwan
Abstract :
The accelerated growth of the Internet has enabled users worldwide to share their feelings and experiences. User-generated content (UGC) websites are the most abundant sources of user reviews. Accurately identifying sentiment phrases is essential to understanding the opinions expressed in user reviews, and the part-of-speech (POS) patterns of phrases are useful for this purpose. However, previous studies on Chinese opinion extraction simply translate English POS patterns into Chinese for this task, without considering whether the translated patterns are feasible for Chinese. Therefore, this work proposes a Chinese opinion extraction method that exploits POS patterns observed in the Sinica Treebank, a widely representative POS corpus for Chinese, for sentiment phrase identification. The results of preliminary experiments indicate that the proposed method is highly effective in extracting opinions from Chinese UGC reviews.
Keywords :
Internet; Web sites; language translation; natural language processing; text analysis; Sinica Treebank POS pattern; Chinese part-of-speech patterns; opinion extraction; sentiment phrase identification; user-generated content Website; Accuracy; Classification algorithms; Discussion forums; Machine learning; Speech; Tagging; User-generated content
Conference_Title :
Digital Information Management (ICDIM), 2010 Fifth International Conference on
Conference_Location :
Thunder Bay, ON
Print_ISBN :
978-1-4244-7572-8
DOI :
10.1109/ICDIM.2010.5664631