DocumentCode :
2493202
Title :
A robust and scalable framework for detecting self-reported illness from Twitter
Author :
Khan, Muhammad Asif Hossain ; Iwai, Masayuki ; Sezaki, Kaoru
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
fYear :
2012
fDate :
10-13 Oct. 2012
Firstpage :
303
Lastpage :
308
Abstract :
Early detection of the onset and outbreak of infectious diseases is of paramount importance in containing such diseases before they turn into epidemics. The rapid growth in popularity and in the spatial resolution of coverage has made micro-blogging sites like Twitter a promising source of information for assessing how the intensity of such diseases evolves within a locality. However, distinguishing tweets that self-report illness from other 'disease-related' tweets is important for avoiding false alarms. In this research, we aim to separate tweets that all fall under the general category of 'disease-related'. By using a relatively small training set and modifying the conventional n-gram feature selection method, we were able to isolate tweets reporting an individual's illness with around 88.7% precision.
Keywords :
diseases; feature extraction; medical computing; social networking (online); Twitter; disease intensity evolution; disease related tweets; infectious disease onset detection; infectious disease outbreak detection; microblogging sites; n-gram feature selection method; self reported illness detection; Accuracy; Diseases; Educational institutions; Electronic mail; Noise measurement; Training; Twitter; Collective intelligence; Epidemic intelligence; Infodemiology; Short text classification; Trend analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
e-Health Networking, Applications and Services (Healthcom), 2012 IEEE 14th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-2039-0
Electronic_ISBN :
978-1-4577-2038-3
Type :
conf
DOI :
10.1109/HealthCom.2012.6379425
Filename :
6379425