DocumentCode :
2663911
Title :
Improving the classification of terrorist attacks: a study on data pre-processing for mining the Global Terrorism Database
Author :
Pagán, José V.
Author_Institution :
Electr. & Comput. Eng. & Comput. Sci. Dept., Polytech. Univ. of Puerto Rico, San Juan, Puerto Rico
Volume :
1
fYear :
2010
fDate :
3-5 Oct. 2010
Abstract :
The objective of this paper is to analyze different data preprocessing techniques for mining the Global Terrorism Database to improve the classification of terrorist attacks by perpetrator in Iraq. Four methods for dealing with missing values (Case Deletion, Mean Imputation, Median Imputation and KNN Imputation), three discretization methods (1R, Entropy and Equal Width), and three classifiers (Linear Discriminant Analysis, K-Nearest Neighbor and Recursive Partitioning) are evaluated using ten-fold cross-validation estimates of the misclassification error. The study concludes (i) that data preprocessing can significantly reduce the classification error rate for this dataset, and (ii) that adding Global Positioning System coordinates for the location of the incidents can further reduce the classification error rate.
Keywords :
data mining; pattern classification; police data processing; terrorism; 1R discretization method; Iraq; KNN imputation; case deletion; classification error rate; data preprocessing; entropy discretization method; equal width discretization method; global terrorism database mining; k-nearest neighbor; linear discriminant analysis; mean imputation; median imputation; recursive partitioning; terrorist attacks; Databases; Entropy; Error analysis; Software; Visualization; classification
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Technology and Engineering (ICSTE), 2010 2nd International Conference on
Conference_Location :
San Juan, PR
Print_ISBN :
978-1-4244-8667-0
Electronic_ISBN :
978-1-4244-8666-3
Type :
conf
DOI :
10.1109/ICSTE.2010.5608902
Filename :
5608902