Title of article
Discovering filter keywords for company name disambiguation in Twitter
Author/Authors
Spina, Damiano; Gonzalo, Julio; Amigó, Enrique
Issue Information
Journal with serial number, year 2013
Pages
18
From page
4986
To page
5003
Abstract
A major problem in monitoring the online reputation of companies, brands, and other entities is that entity names are often ambiguous (apple may refer to the company, the fruit, the singer, etc.). The problem is particularly hard in microblogging services such as Twitter, where texts are very short and there is little context to disambiguate. In this paper we address the filtering task of determining, out of a set of tweets that contain a company name, which ones do refer to the company. Our approach relies on the identification of filter keywords: those whose presence in a tweet reliably confirms (positive keywords) or rules out (negative keywords) that the tweet refers to the company.
We describe an algorithm to extract filter keywords that does not use any previously annotated data about the target company. The algorithm classifies 58% of the tweets with 75% accuracy, and those tweets can be used to feed a machine learning algorithm to obtain a complete classification of all tweets with an overall accuracy of 73%. In comparison, a 10-fold validation of the same machine learning algorithm provides an accuracy of 85%, i.e., our unsupervised algorithm has a 14% relative loss with respect to its supervised counterpart.
Our study also shows that (i) filter keywords for Twitter do not directly derive from the public information about the company on the Web: a manual selection of keywords from relevant web sources covers only 15% of the tweets with 86% accuracy; (ii) filter keywords can indeed be a productive way of classifying tweets: the five best possible keywords cover, on average, 28% of the tweets for a company in our test collection.
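As an illustration of the filter-keyword idea described in the abstract, the following is a minimal sketch and not the authors' algorithm: it assumes the positive and negative keyword sets are already known, whereas the paper's contribution is discovering them automatically. The keyword sets, the example tweets, and the names `classify_with_filter_keywords` and `coverage` are hypothetical.

```python
import re

# Hypothetical filter keywords for the ambiguous name "apple" (illustration only).
POSITIVE = {"iphone", "ipad", "mac"}    # presence suggests the company
NEGATIVE = {"pie", "juice", "fruit"}    # presence suggests another sense


def classify_with_filter_keywords(tweet, positive, negative):
    """Return True (refers to the company), False (does not),
    or None when no filter keyword applies or the evidence conflicts."""
    tokens = set(re.findall(r"[a-z0-9#@']+", tweet.lower()))
    has_positive = bool(tokens & positive)
    has_negative = bool(tokens & negative)
    if has_positive and not has_negative:
        return True
    if has_negative and not has_positive:
        return False
    return None


def coverage(labels):
    """Fraction of tweets that received a label from the filter keywords."""
    return sum(label is not None for label in labels) / len(labels)


tweets = [
    "New apple iphone released today",
    "Baking an apple pie tonight",
    "I love apple",
]
labels = [classify_with_filter_keywords(t, POSITIVE, NEGATIVE) for t in tweets]
```

In this sketch the tweets left as `None` are the uncovered ones; in the approach the abstract describes, the tweets labeled by keywords would then serve as training data for a machine learning classifier that labels the rest.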
Keywords
Twitter, filtering, name disambiguation, online reputation management
Journal title
Expert Systems with Applications
Serial Year
2013
Record number
2353743