DocumentCode :
3571530
Title :
Mining the Demographics of Craigslist Casual Sex Ads to Inform Public Health Policy
Author :
Fries, Jason A. ; Polgreen, Philip M. ; Segre, Alberto M.
Author_Institution :
Dept. of Comput. Sci., Univ. of Iowa, Iowa City, IA, USA
fYear :
2014
Firstpage :
61
Lastpage :
70
Abstract :
Anonymous sexual encounters negotiated via the Internet present many challenges to public health officials addressing outbreaks of sexually transmitted infections. The anonymity and potential geographic scale of encounters weaken traditional tools like contact tracing and partner notification. These developments complicate interventions within the men who have sex with men (MSM) population, which has seen increasing health disparities in HIV and syphilis incidence rates over the last decade. This paper presents text-mining methods for conducting public health surveillance of the anonymous MSM populations using the online classified advertisement website Craigslist to negotiate casual sexual encounters. We analyze 2.5 years of Craigslist data (134 million ads) and present machine learning and rule-based approaches for efficiently mining race/ethnicity and age information from Craigslist text. Using previous work in geographic entity recognition, we link ads with specific locations and generate Craigslist MSM summary statistics for race/ethnicity and age cohorts in urban and rural geographic areas. This data is then compared to demographic information from the 2010 U.S. Census to quantify how well it reflects the known, underlying population. We find significant correlations between Craigslist and census population statistics, suggesting our approach's utility for surveillance applications.
Keywords :
Internet; Web sites; advertising; data mining; demography; diseases; health care; knowledge based systems; learning (artificial intelligence); medical information systems; statistical analysis; Craigslist MSM summary statistics; Craigslist casual sex ads; HIV; Internet; age information; anonymous MSM population; anonymous sexual encounter; census population statistics; contact tracing; demographics; geographic entity recognition; geographic scale; health disparities; machine learning; men-who-have-sex-with-men population; online classified advertisement Web site; partner notification; public health policy; public health surveillance; race-ethnicity; rule-based approach; rural geographic area; sexually transmitted infection; syphilis incidence rate; text-mining; urban geographic area; Cities and towns; Human immunodeficiency virus; Internet; Public healthcare; Sociology; Statistics; Terminology; Knowledge discovery; Natural language processing; Public healthcare; Supervised learning; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Healthcare Informatics (ICHI), 2014 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICHI.2014.16
Filename :
7052471