Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
Abstract :
Microblogging (e.g., on Twitter or Facebook) has become a very popular communication tool among Internet users in recent years. Information is generated and managed by one person through a computer or mobile device and is consumed by many others, and most of this user-generated content is textual. Because people post large volumes of real-time messages expressing their opinions on a wide variety of everyday topics, collecting and analyzing these raw data is a worthwhile research endeavor: the results may help users or managers make informed decisions, for example. However, the problem is challenging because a microblog post is usually very short and colloquial, and traditional opinion mining algorithms do not work well on this type of text. Therefore, in this paper, we propose a new system architecture that can automatically analyze the sentiments of these messages. We combine this system with manually annotated data from Twitter, one of the most popular microblogging platforms, for the task of sentiment analysis. In this system, machines learn to automatically extract the messages that contain opinions, filter out nonopinion messages, and determine the sentiment direction of each opinion message (i.e., positive or negative). Experimental results verify the effectiveness of our system for sentiment analysis in real microblogging applications.
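The two-stage flow described in the abstract (filter out nonopinion messages, then assign a sentiment direction) can be illustrated with a minimal sketch. It is not the system proposed in the paper: the abstract describes classifiers learned from manually annotated Twitter data, whereas this sketch substitutes a simple word-list lookup so that it stays self-contained. The word lists and the function names `tokenize`, `is_opinion`, `polarity`, and `analyze` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicons (an assumption for illustration only);
# the abstract does not specify the paper's features or training data.
POSITIVE = {"love", "great", "good", "awesome", "happy"}
NEGATIVE = {"hate", "bad", "awful", "terrible", "sad"}


def tokenize(text):
    """Lowercase a post and split it into word tokens, dropping URLs and @mentions."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def is_opinion(tokens):
    """Stage 1: treat a post as an opinion if it contains any sentiment word."""
    return any(t in POSITIVE or t in NEGATIVE for t in tokens)


def polarity(tokens):
    """Stage 2: compare positive and negative word counts (ties count as positive)."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score >= 0 else "negative"


def analyze(posts):
    """Run both stages; nonopinion posts are filtered out of the result."""
    results = []
    for post in posts:
        tokens = tokenize(post)
        if is_opinion(tokens):
            results.append((post, polarity(tokens)))
    return results
```

For example, `analyze(["I love this movie", "Flight lands at 5pm"])` keeps only the first post and labels it positive. In a learned system of the kind the abstract describes, each stage would instead be a classifier trained on the annotated messages.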
Keywords :
Internet; cognition; content management; data mining; decision making; electronic messaging; information filtering; social networking (online); Twitter; automatic message extraction; data analysis; data collection; manual data annotation; message posting; message sentiment analysis; microblogging; mobile device; nonopinion message filtering; opinion mining; sentiment analysis; social media data; textual information; user generated content; Accuracy; Dictionaries; Motion pictures; Text categorization; Training; Training data;