Title :
Modeling and detecting anomalous topic access
Author :
Gupta, Swastik ; Hanson, Catherine ; Gunter, Carl A. ; Frank, Michael ; Liebovitz, David ; Malin, Bradley
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Abstract :
There has been considerable success in developing strategies to detect insider threats in information systems based on what one might call the random object access (ROA) model. This approach models illegitimate users as ones who access records at random. The goal is to use statistics, machine learning, knowledge of workflows and other techniques to support an anomaly detection framework that finds such users. In this paper we introduce and study a random topic access (RTA) model aimed at users whose access may be illegitimate but is not fully random because it is focused on common semantic themes. We argue that this model is appropriate for a meaningful range of attacks and develop a system based on topic summarization that formalizes the model and provides effective anomalous user detection for it. To this end, we use healthcare as an example and propose a framework, called random topic access detection (RTAD), for evaluating the ability to recognize various types of random users. Specifically, we combine Latent Dirichlet Allocation (LDA) for feature extraction with a k-nearest neighbor (k-NN) algorithm for outlier detection, and evaluate the ability to identify different adversarial types. We validate the technique in the context of hospital audit logs, where we show varying degrees of success depending on user roles and the anticipated characteristics of attackers. In particular, RTAD exhibits strong performance for roles that are described by a few topics, but weaker performance when users are more topic-agnostic.
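The outlier-detection half of the pipeline described in the abstract can be illustrated with a minimal sketch. It assumes that each user has already been summarized as a vector of topic proportions (for example, by an LDA model fitted on the records that user accessed); the LDA step itself is not shown. The function names, the four-topic vectors, the noise level and the choice of mean Euclidean distance to the k nearest neighbors as the score are all illustrative assumptions, not details taken from the paper.

```python
import math
import random


def knn_outlier_scores(vectors, k=3):
    """Score each vector by its mean Euclidean distance to its k nearest neighbors.

    A larger score means the vector sits farther from its closest peers.
    """
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def make_role_users(rng, n_users=20, noise=0.03):
    """Hypothetical users of one role whose accesses concentrate on topics 0 and 1."""
    base = [0.60, 0.30, 0.05, 0.05]  # assumed topic mix for the role
    users = []
    for _ in range(n_users):
        raw = [b + rng.uniform(-noise, noise) for b in base]
        total = sum(raw)
        users.append([x / total for x in raw])  # renormalize to sum to 1
    return users


rng = random.Random(0)
users = make_role_users(rng)

# Simulated random-topic-access user: focused on a topic unusual for this role.
attacker = [0.05, 0.05, 0.10, 0.80]

scores = knn_outlier_scores(users + [attacker], k=3)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print("index of highest-scoring user:", most_anomalous)
```

In this toy setting the simulated user is far from every legitimate user, so it receives the highest score. For a role whose members spread across many topics, legitimate users would also be far from one another, which is consistent with the weaker performance the abstract reports for topic-agnostic users.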
Keywords :
data mining; feature extraction; health care; learning (artificial intelligence); medical information systems; random processes; security of data; LDA; ROA model; RTA; RTAD; anomalous topic access detection; anomalous topic access modeling; anomalous user detection; anomaly detection framework; attacker characteristics; feature extraction; healthcare; hospital audit logs; information systems; insider threat detection; k-NN algorithm; k-nearest neighbor algorithm; latent Dirichlet allocation; machine learning; model formalization; outlier detection; random object access model; random record access; random topic access detection; random topic access model; random users; statistics; strategy development; topic summarization; user roles; workflow knowledge; Context; Hospitals; Information systems; Medical diagnostic imaging; Sociology; Statistics; Access Logs; Anomaly Detection; Data Mining; Electronic Health Records; Healthcare Security; Insider threats;
Conference_Title :
Intelligence and Security Informatics (ISI), 2013 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
978-1-4673-6214-6
DOI :
10.1109/ISI.2013.6578795