DocumentCode :
3268013
Title :
RedJsod: A Readable JavaScript Obfuscation Detector Using Semantic-based Analysis
Author :
AL-Taharwa, Ismail Adel ; Lee, Hahn-Ming ; Jeng, Albert B. ; Wu, Kuo-Ping ; Mao, Ching-Hao ; Wei, Te-En ; Chen, Shyi-Ming
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
fYear :
2012
fDate :
25-27 June 2012
Firstpage :
1370
Lastpage :
1375
Abstract :
JavaScript allows Web developers to hide the intent of their code inside different-looking scripts, known as obfuscated code. Automatic detection of obfuscated code is generally tackled from a readability perspective. Recently, however, obfuscation has exhibited patterns that modify both syntactic and semantic characteristics while preserving readability. Readable obfuscation poses two problems: (1) it is difficult to locate, since it does not manipulate suspicious strings; and (2) it is a common and essential practice in both benign and malicious code. In this work, we first investigate why and how readable obfuscation can hinder the detection of maliciousness and prevent static analysis of suspicious scripts. Next, we propose a readable JavaScript obfuscation detector (RedJsod) to deal with this type of threat. RedJsod is a well-defined detector based on a variable length context-based feature extraction (VCLFE) scheme that takes advantage of the abstract syntax tree (AST) representation of a given JavaScript code to infer run-time behaviors statically. We applied RedJsod to three datasets collected from real-world Web pages to evaluate its effectiveness. We also tested RedJsod on well-known readable obfuscation samples cited in related work as a proof-of-concept illustration. Our experimental results indicate that RedJsod achieved very high detection rates (greater than 97%) in terms of accuracy and eliminated false negatives completely, while yielding very few false positives.
Keywords :
Internet; Java; abstract data types; feature extraction; invasive software; program diagnostics; programming language semantics; tree data structures; AST representation; RedJsod; VCLFE scheme; Web pages; abstract syntax tree; automatic obfuscated code detection; benign codes; malicious codes; maliciousness detection; malware; proof of concept; readability characteristics preservation; readable JavaScript obfuscation detector; run-time behaviors; semantic characteristics; semantic-based analysis; static analysis prevention; suspicious scripts; syntax characteristics; threat detection; variable length context-based feature extraction; Abstracts; Context; Context modeling; Detectors; Encoding; Feature extraction; Malware; AST representation; JavaScript malware; detection; encoding; feature-based; obfuscation; static analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Trust, Security and Privacy in Computing and Communications (TrustCom), 2012 IEEE 11th International Conference on
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2172-3
Type :
conf
DOI :
10.1109/TrustCom.2012.235
Filename :
6296140