Title :
Text message corpus: Applying natural language processing to mobile device forensics
Author :
O'Day, Daniel R. ; Calix, Ricardo A.
Author_Institution :
Purdue Univ. Calumet, Hammond, IN, USA
Abstract :
The average mobile device user sends a large quantity of text and other short messages. These text message data are of great value to law enforcement investigators who may be analyzing a suspect's mobile device or social media profile for evidence of criminal activity. Current tools and methodologies for analyzing text and other short message data generally allow only simple keyword searches, which are often time-consuming for law enforcement investigators. In addition, few corpora containing text message data are available. An initial corpus of text message data for experimental purposes has been developed and made available to the research community. A simple methodology is proposed for feature extraction. The format of the data is given, along with basic statistics, suggestions for possible use, and future work.
Keywords :
digital forensics; electronic messaging; feature extraction; law; mobile computing; mobile handsets; natural language processing; text analysis; criminal activity; keyword searching; law enforcement investigation; mobile device forensics; short message; social media profile analysis; text message corpus; Classification algorithms; Data mining; Forensics; Law enforcement; Semantic Analysis; Short Message Analysis
Conference_Title :
Multimedia and Expo Workshops (ICMEW), 2013 IEEE International Conference on
Conference_Location :
San Jose, CA
DOI :
10.1109/ICMEW.2013.6618380