DocumentCode
1173930
Title
Applying authorship analysis to extremist-group Web forum messages
Author
Abbasi, Ahmed ; Chen, Hsinchun
Author_Institution
Dept. of Manage. Inf. Syst., Arizona Univ., Tucson, AZ, USA
Volume
20
Issue
5
fYear
2005
Firstpage
67
Lastpage
75
Abstract
The speed, ubiquity, and potential anonymity of Internet media - email, Web sites, and Internet forums - make them ideal communication channels for militant groups and terrorist organizations. Analyzing Web content has therefore become increasingly important to the intelligence and security agencies that monitor these groups. Authorship analysis can assist this activity by automatically extracting linguistic features from online messages and evaluating stylistic details for patterns of terrorist communication. However, authorship analysis techniques are rooted in work with literary texts, which differ significantly from online communication. To explore these problems, we modified an existing framework for analyzing online authorship and applied it to Arabic and English Web forum messages associated with known extremist groups. We developed a special multilingual model - the set of algorithms and related features - to identify Arabic messages, gearing this model toward the language's unique characteristics. Furthermore, we incorporated a complex message extraction component to allow the use of a more comprehensive set of features tailored specifically toward online messages. Evaluating the linguistic features of Web messages and comparing them to known writing styles offers the intelligence community a tool for identifying patterns of terrorist communication.
Keywords
Internet; authoring systems; feature extraction; linguistics; natural languages; social aspects of automation; terrorism; Arabic Web forum messages; English Web forum messages; Internet forums; Web content analysis; Web sites; automatic linguistic feature extraction; email; extremist group Web forum messages; message extraction component; multilingual model; online authorship analysis; pattern identification tool; terrorist communication; Algorithm design and analysis; Communication channels; Discussion forums; Feature extraction; Monitoring; Pattern analysis; Security; Terrorism; Vocabulary; Writing; Web content analysis; Web forum postings; Web mining; authorship analysis; multilingual; security; text analysis;
fLanguage
English
Journal_Title
Intelligent Systems, IEEE
Publisher
IEEE
ISSN
1541-1672
Type
jour
DOI
10.1109/MIS.2005.81
Filename
1512002