Title :
An improved sentiment analysis of online movie reviews based on clustering for box-office prediction
Author :
Nagamma, P. ; Pruthvi, H.R. ; Nisha, K.K. ; Shwetha, N.H.
Author_Institution :
Dept. of Inf. Technol., Nat. Inst. of Technol., Mangalore, India
Abstract :
With the rapid development of e-commerce, more online reviews of products and services are being created, and they form an important source of information for both sellers and customers. Research on sentiment and opinion mining for online review analysis has attracted increasing attention because such study helps leverage information from online reviews for potential economic impact. The paper discusses applying sentiment analysis and machine learning methods to study the relationship between the online reviews for a movie and the movie's box-office revenue performance. The paper shows that a simplified version of the sentiment-aware autoregressive model can produce very good accuracy for predicting box-office sales using online review data. Document-level sentiment analysis is performed using Term Frequency (TF) and Inverse Document Frequency (IDF) values as features, together with fuzzy clustering, which groups the reviews into positive and negative sentiments. This leads to a simpler model that could be more efficient to train and use. In addition, a classification model is created using a Support Vector Machine (SVM) classifier for predicting the trend of the box-office revenue from the review sentiment.
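The feature-extraction and clustering step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the fuzzifier m = 2, the choice of the first and last review as initial cluster centers, and the toy reviews are all assumptions made for illustration. The abstract also does not say how a cluster is identified as positive or negative, so the sketch only reports which reviews fall into the same cluster.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, vectors) of TF-IDF weights, one vector per document.

    Assumption: TF is the relative term count and IDF is the smoothed form
    log((1 + N) / (1 + df)) + 1; the paper does not state its exact variant.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[t] / len(toks) * idf[t] for t in vocab])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, centers, m=2.0, iters=50, eps=1e-12):
    """Standard fuzzy c-means: alternate membership and center updates.

    Returns (memberships, centers); each membership row sums to 1.
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: u_ik is proportional to D_ik^(-1/(m-1)),
        # where D_ik is the squared distance from point k to center i.
        memberships = []
        for p in points:
            d = [max(sq_dist(p, c), eps) for c in centers]
            inv = [(1.0 / di) ** (1.0 / (m - 1.0)) for di in d]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        for i in range(len(centers)):
            w = [u[i] ** m for u in memberships]
            wsum = sum(w)
            centers[i] = [
                sum(wk * p[j] for wk, p in zip(w, points)) / wsum
                for j in range(dim)
            ]
    return memberships, centers


# Hypothetical toy reviews, not data from the paper.
reviews = [
    "great wonderful great film",
    "wonderful great acting",
    "boring awful film",
    "awful boring plot",
]
vocab, X = tfidf(reviews)
# Assumption: deterministic initialization from the first and last review.
U, _ = fuzzy_c_means(X, centers=[X[0], X[-1]])
labels = [max(range(2), key=lambda i: u[i]) for u in U]
print(labels)
```

Each row of `U` holds one review's degree of membership in the two clusters, which is what distinguishes fuzzy clustering from a hard assignment; `labels` simply takes the larger of the two degrees.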
Keywords :
learning (artificial intelligence); pattern classification; support vector machines; text analysis; IDF; SVM classifier; TF; box-office prediction; document level sentiment analysis; e-commerce; fuzzy clustering; improved sentiment analysis; inverse document frequency values; machine learning methods; online movie review analysis; opinion mining; sentiment-aware autoregressive model; support vector machine classifier; term frequency values; Accuracy; Data mining; Mathematical model; Motion pictures; Predictive models; Sentiment analysis; Support vector machines;
Conference_Title :
Computing, Communication & Automation (ICCCA), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8889-1
DOI :
10.1109/CCAA.2015.7148530