Title : 
Topic modeling of SSH logs using latent Dirichlet allocation for the application in cyber security
         
        
            Author : 
Aswani, Krishna ; Cronin, Aidan ; Xirui Liu ; Heyuan Zhao
         
        
            Author_Institution : 
Univ. of Virginia, Charlottesville, VA, USA
         
        
        
        
        
        
            Abstract : 
Cyber intrusions are a major source of concern across the internet, and the substantial increase in network traffic has made detecting every unauthorized access extremely difficult. Brute-force attacks are the most common form of malicious traffic. Many new techniques have been developed to prevent such attacks and detect them in real time, and the majority of these monitor the sequential transfers between users/IPs and the network. However, although many networks now monitor their logs and can identify when brute-force attacks occur, they cannot provide more detailed information about an attack (such as where and how it took place) without some form of direct visual inspection of the logs. In this paper, we explore Latent Dirichlet Allocation (LDA) as a form of topic modeling of IP addresses through SSH authentication logs, with the final goal of automating the classification of users. Using textual topics, or the “top words” associated with logs, we differentiate legitimate users from brute-force attackers according to their IP addresses and discuss the potential of topic modeling for identifying and further classifying cyber threats.
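The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the log format, preprocessing, or inference method, so the sample lines, the tokenization rule, the helper names `docs_by_ip` and `lda_gibbs`, and the choice of a collapsed Gibbs sampler are all assumptions made for illustration. Each IP address is treated as one document whose words come from its SSH authentication log lines, and LDA then yields a topic mixture per IP and "top words" per topic.

```python
import random
import re
from collections import defaultdict

# Hypothetical sshd-style lines; real log formats vary between systems.
SAMPLE_LOG = [
    "Failed password for root from 10.0.0.5 port 4242 ssh2",
    "Failed password for invalid user admin from 10.0.0.5 port 4243 ssh2",
    "Failed password for root from 10.0.0.5 port 4244 ssh2",
    "Accepted publickey for alice from 192.168.1.7 port 5001 ssh2",
    "Accepted password for bob from 192.168.1.8 port 5002 ssh2",
    "Failed password for bob from 192.168.1.8 port 5003 ssh2",
]

IP_RE = re.compile(r"from (\d+\.\d+\.\d+\.\d+)")


def docs_by_ip(lines):
    """Group log lines by source IP; each IP becomes one bag of words."""
    docs = defaultdict(list)
    for line in lines:
        match = IP_RE.search(line)
        if not match:
            continue
        # Assumed preprocessing: keep lowercase alphabetic tokens only.
        words = [w.lower() for w in re.findall(r"[A-Za-z]+", line)]
        docs[match.group(1)].extend(words)
    return dict(docs)


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not optimized)."""
    rng = random.Random(seed)
    ips = sorted(docs)
    vocab = sorted({w for ws in docs.values() for w in ws})
    w_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * n_topics for _ in ips]        # document-topic counts
    nkw = [[0] * V for _ in range(n_topics)]   # topic-word counts
    nk = [0] * n_topics                        # words assigned to each topic
    z = []                                     # topic assignment per word
    for d, ip in enumerate(ips):
        zd = []
        for w in docs[ip]:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, ip in enumerate(ips):
            for i, w in enumerate(docs[ip]):
                v, k = w_id[w], z[d][i]
                # Remove the current assignment before resampling it.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    # Smoothed topic mixture for each IP address.
    theta = {
        ip: [(ndk[d][t] + alpha) / (len(docs[ip]) + n_topics * alpha)
             for t in range(n_topics)]
        for d, ip in enumerate(ips)
    }
    # Three most frequent words per topic (the "top words").
    top_words = [
        [vocab[v] for v in sorted(range(V), key=lambda v: -nkw[t][v])[:3]]
        for t in range(n_topics)
    ]
    return theta, top_words


if __name__ == "__main__":
    theta, top_words = lda_gibbs(docs_by_ip(SAMPLE_LOG))
    for ip, mixture in theta.items():
        print(ip, [round(p, 2) for p in mixture])
    print(top_words)
```

On data this small the topics are not meaningful, and the result depends on the random seed. In practice one would inspect the top words of each topic and the per-IP mixtures on a much larger log to judge whether any topic separates brute-force behavior from legitimate logins.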
         
        
            Keywords : 
IP networks; Internet; authorisation; computer network security; pattern classification; telecommunication traffic; IP addresses; Internet; SSH authentication logs; brute-force attacks; cyber intrusions; cyber security; cyber threats classification; direct visual inspection; latent dirichlet allocation; malicious traffic; network traffic; sequential transfers; textual topics; topic modeling; unauthorized access; Data models; Feature extraction; Force; Hidden Markov models; IP networks; Resource management; Servers; Brute-force attacks; LDA; SSH logs; Topic model;
         
        
        
        
            Conference_Title : 
Systems and Information Engineering Design Symposium (SIEDS), 2015
         
        
            Conference_Location : 
Charlottesville, VA
         
        
            Print_ISBN : 
978-1-4799-1831-7
         
        
        
            DOI : 
10.1109/SIEDS.2015.7117015