Title :
A Probabilistic Generative Model for Mining Cybercriminal Networks from Online Social Media
Author :
Lau, Raymond Y. K. ; Yunqing Xia ; Yunming Ye
Author_Institution :
Dept. of Inf. Syst., City Univ. of Hong Kong, Hong Kong, China
Abstract :
There has been rapid growth in the number of cybercrimes that cause tremendous financial loss to organizations. Recent studies reveal that cybercriminals tend to collaborate, and even trade cyber-attack tools, via the "dark markets" established in online social media. This presents unprecedented opportunities for researchers to tap into these underground cybercriminal communities and gain better insight into collaborative cybercrime activities, so as to combat the ever-increasing number of cybercrimes. The main contribution of this paper is a novel weakly supervised cybercriminal network mining method to facilitate cybercrime forensics. In particular, the proposed method is underpinned by a probabilistic generative model enhanced by a novel context-sensitive Gibbs sampling algorithm. Experiments on two social media corpora show that the proposed method significantly outperforms the Latent Dirichlet Allocation (LDA) based method and the Support Vector Machine (SVM) based method by 5.23% and 16.62%, respectively, in terms of Area Under the ROC Curve (AUC). It also achieves performance comparable to that of the state-of-the-art Partially Labeled Dirichlet Allocation (PLDA) method. To the best of our knowledge, this is the first successful application of a probabilistic generative model to mining cybercriminal networks from online social media.
Keywords :
data mining; digital forensics; sampling methods; social networking (online); AUC; PLDA method; area under the ROC curve; collaborative cybercrime activities; context-sensitive Gibbs sampling algorithm; cyber-attack tools; cybercrime forensics; dark markets; online social media; partially labeled Dirichlet allocation method; probabilistic generative model; social media corpora; supervised cybercriminal network mining method; underground cybercriminal communities; Computer crime; Computer security; Data mining; Hackers; Natural language processing; Network security; Probabilistic logic; Social network services;
Journal_Title :
Computational Intelligence Magazine, IEEE
DOI :
10.1109/MCI.2013.2291689