Title :
Analyze and detect malicious code for compound document binary storage format
Author :
Gao, Yu-xiang ; Qi, De-yu
Author_Institution :
Res. Inst. of Comput. Syst., South China Univ. of Technol., Guangzhou, China
Abstract :
Compared with traditional attacks, embedding malicious code in documents is becoming a more efficient and better-hidden way to attack. Attackers exploit the document storage format to embed code that activates covertly when third-party software opens the document, so the simple act of double-clicking a document can be a nightmare for the user. By studying the structure of compound files, focusing mainly on Word documents, we look for a method to detect such embedded code. We use a Bloom filter together with the entropy rate of a Markov chain and reach high accuracy. We detect embedded malicious code by analyzing the code itself, because it consists of machine instructions that the CPU must be able to execute. Our basic assumption is that machine instructions in a document differ from normal text, pictures, tables, etc. Detection therefore amounts to finding the regions of the document that differ from the rest, and we use the entropy rate as the measure that quantifies this difference.
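The entropy-rate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a first-order Markov chain over byte values, estimates the entropy rate empirically from byte-pair counts, and scans the file with a sliding window. The function names, the window size, and the step size are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def markov_entropy_rate(data: bytes) -> float:
    """Empirical first-order entropy rate of a byte string, in bits per byte.

    Computes -sum_{a,b} p(a,b) * log2 p(b|a) from observed byte pairs.
    Assumption: bytes are modeled as a first-order Markov chain.
    """
    if len(data) < 2:
        return 0.0
    pair_counts = Counter(zip(data, data[1:]))  # counts of transitions a -> b
    first_counts = Counter(data[:-1])           # counts of a as a transition source
    total = len(data) - 1                       # number of observed transitions
    rate = 0.0
    for (a, _b), n_ab in pair_counts.items():
        rate -= (n_ab / total) * math.log2(n_ab / first_counts[a])
    return rate


def windowed_rates(data: bytes, window: int = 256, step: int = 128):
    """Return (offset, entropy_rate) for each sliding window over the data.

    The window and step sizes are illustrative assumptions, not values
    taken from the paper.
    """
    last_start = max(len(data) - window, 0)
    return [
        (off, markov_entropy_rate(data[off:off + window]))
        for off in range(0, last_start + 1, step)
    ]
```

Under this sketch, windows whose entropy rate departs markedly from that of the surrounding content would be candidates for closer inspection. How to choose the threshold, and how the Bloom filter is combined with this measure, is not specified in the abstract and is left open here.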
Keywords :
Markov processes; entropy; security of data; word processing; CPU; Markov chain; Word documents; bloom filter; compound document binary storage format; document storage format; entropy rate; machine instructions; malicious attack; malicious code analysis; malicious code detection; third-party software; Accuracy; Compounds; Cybernetics; Entropy; Jacobian matrices; Machine learning; Markov processes; Bloom filter; Document storage format; Entropy rate; Malicious codes; Markov chain;
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
Conference_Location :
Guilin
Print_ISBN :
978-1-4577-0305-8
DOI :
10.1109/ICMLC.2011.6016767