Using Political Party Affiliation Data to Measure Civil Servants' Risk of Corruption

Author

Carvalho, Rommel ; Carvalho, Rommel ; Ladeira, Marcelo ; Mendes Monteiro, Fernando ; Mendes, Gilson

Author_Institution

Dept. of Strategic Inf., Brazilian Office of the Comptroller Gen., Brasilia, Brazil

fYear

2014

fDate

18-22 Oct. 2014

Firstpage

166

Lastpage

171

Abstract

This paper presents a case study of machine learning applied to measuring civil servants' risk of corruption using political party affiliation data. Initially, a statistical hypothesis test verified the dependency between corruption and political party affiliation. Then, we constructed datasets using standardization and three different discretization techniques. Using the Weka environment, this work shows the application and statistical evaluation of four classification algorithms to build models for predicting risk of corruption: Bayesian Networks, Support Vector Machines, Random Forest, and Artificial Neural Networks with backpropagation. To evaluate the models, we used data mining metrics such as precision, recall, kappa statistic, and percent correct. Lastly, the case study compares the best-performing learned model with the experts' model. The comparison not only confirms the experts' previous affirmations but also provides new assertions on the affiliation-corruptibility relation.

Keywords

backpropagation; belief networks; data mining; neural nets; pattern classification; politics; standardisation; statistical testing; support vector machines; Bayesian network classification algorithm; Weka environment; affiliation-corruptibility relation; artificial neural network classification algorithm; backpropagation; civil servant corruption risk measurement; corruption risk prediction; data mining metrics; discretization techniques; expert affirmations; expert model; kappa statistic; learned model; machine learning; percent correct; political party affiliation data; precision; random forest classification algorithm; recall; standardization; statistical evaluation; statistical hypothesis test; support vector machine classification algorithm; Artificial neural networks; Bayes methods; Data mining; Measurement; Prediction algorithms; Radio frequency; Support vector machines; Civil Servant; Corruption; Machine Learning; Political Affiliation; Random Forest;

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Systems (BRACIS), 2014 Brazilian Conference on

Conference_Location

Sao Paulo

Type

conf

DOI

10.1109/BRACIS.2014.39

Filename

6984825