Title :
User-annotated microtext data for modeling and analyzing users' sociolinguistic characteristics and age grading
Author :
Moseley, Nathaniel ; Alm, Cecilia Ovesdotter ; Rege, Manjeet
Author_Institution :
Dept. of Comput. Sci., Rochester Inst. of Technol., Rochester, NY, USA
Abstract :
Information from Twitter messages has become an important area of research in the computational analysis of natural language. As yet, much latent user attribute analysis on Twitter remains unexplored. One reason is that only a few latent attributes are explicitly defined by users on Twitter. This work presents and analyzes a data set annotated by Twitter users themselves for age and other useful attributes, intended for use in latent attribute inference applications. We report on a statistical analysis of the collected latent attributes and tweet information using association mining.
Keywords :
Internet; age issues; computational linguistics; data analysis; data mining; inference mechanisms; natural language processing; social networking (online); statistical analysis; text analysis; Twitter messages; abbreviation transformations; age grading; association mining; computational natural language analysis; data set analysis; latent attribute inference applications; latent user attribute analysis; user sociolinguistic characteristics modeling; user-annotated microtext data; users sociolinguistic characteristics analysis; Algorithm design and analysis; Analytical models; Medical services; Abbreviation Transformations; Association Mining; Latent User Annotation; Microblog Dataset
Conference_Title :
Research Challenges in Information Science (RCIS), 2014 IEEE Eighth International Conference on
Conference_Location :
Marrakech
DOI :
10.1109/RCIS.2014.6861046