Title :
Empirical Evaluation of Profile Characteristics for Gender Classification on Twitter
Author :
Alowibdi, Jalal S. ; Buy, Ugo A. ; Yu, Paul
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Chicago, Chicago, IL, USA
Abstract :
Online Social Networks (OSNs) provide reliable communication among users from different countries. The volume of text generated on OSNs is huge and highly informative. Gender classification can serve commercial organizations for advertising, law enforcement for legal investigation, and others for social reasons. Here we explore profile characteristics for gender classification on Twitter. Unlike existing approaches to gender classification, which depend heavily on posted text such as tweets, we study the relative strengths of different characteristics extracted from Twitter profiles (e.g., first name and background color in a user's profile page). Our goal is to evaluate profile characteristics with respect to their predictive accuracy and computational complexity. In addition, we provide a novel technique to reduce the number of features of text-based profile characteristics from the order of millions to a few thousand and, in some cases, to only 40 features. We validate our approach by examining different classifiers over a large dataset of Twitter profiles.
Keywords :
computational complexity; computer mediated communication; gender issues; pattern classification; social networking (online); OSN; Twitter profiles; background color; commercial organizations; gender classification; informative texts; legal investigation; online social networks; profile characteristics; text-based profile characteristics; user profile page; Accuracy; Color; Image color analysis; Niobium; Quantization (signal); Sorting; Twitter; Color-based features; color quantization; language independence; phonemes as features; social networks
Conference_Title :
Machine Learning and Applications (ICMLA), 2013 12th International Conference on
Conference_Location :
Miami, FL
DOI :
10.1109/ICMLA.2013.74