Author :
Gomes, H. ; de Castro Neto, Miguel ; Henriques, Rui
Author_Institution :
Inst. Super. de Estatistica e Gestao de Informacao, Univ. Nova de Lisboa, Lisbon, Portugal
Abstract :
In the last few years, the emergence of social networks has brought major changes to the interaction between customers and companies. This change, like others, has both advantages and disadvantages. One major disadvantage is that organizations have lost control over what customers say about them, since customers can easily publish negative opinions and spread them rapidly. However, some organizations have quickly realized that this situation can yield important competitive advantages, through the analysis of what customers say about them in different communication channels. In addition, the growing use of the internet has made a large amount of information available online; for example, most newspapers now publish their content daily on their websites. Consequently, the volume of data available online each day grows exponentially, and the information derived from this data can be valuable if it is processed and used correctly. Hence the challenge of creating knowledge from this information in an automated way. The goal of this project is therefore to build a model able to evaluate the polarity (positive, negative or neutral) of economic news headlines published through RSS feeds. To that end, SAS software was used, and with it the SAS methodology, whose detailed description is also a goal. Section I introduces the subject and its context. Section II presents the goals of the project that originated this paper, followed by the state of the art in Section III. Section IV describes the methodology for Knowledge Discovery in Text as well as the methodology used to create the Sentiment Analysis model. Section V reports the results achieved with the implementation of this project and, finally, Section VI presents the conclusions.
Keywords :
data mining; pattern classification; publishing; text analysis; RSS feeds; contextualization; economic news headlines; negative polarity; neutral polarity; news classification; polarity evaluation; positive polarity; sentiment analysis model; software SAS; text knowledge discovery; text mining; Discrete cosine transforms; Face; Feeds; Internet; Software; Synthetic aperture sonar; Natural Language Processing; Sentiment Analysis