Title :
Malware variant detection using similarity search over content fingerprint
Author :
Ban Xiaofang ; Chen Li ; Hu Weihua ; Wu Qu
Author_Institution :
China Inf. Technol. Security Evaluation Center, Beijing, China
Date :
May 31 2014-June 2 2014
Abstract :
Detection of polymorphic malware variants plays an important role in improving information system security. Features extracted by traditional static and dynamic analysis have been shown to characterize polymorphic malware instances effectively. While these approaches show promise, they are themselves subject to a growing array of countermeasures that increase the cost of capturing these malware code features. Further, feature extraction requires a time investment per sample that does not scale well to the daily volume of malware reported by those who collect it. In this paper, we propose a similarity search over malware using novel distance (similarity) metrics on malware content fingerprints, based on locality-sensitive hashing (LSH) schemes. We describe a malware sample by its binary content, which is mapped to a binary image. We then compute a feature fingerprint for that image using the SURF algorithm, and perform fast fingerprint matching with LSH against a malware code corpus to return the most visually (structurally) similar variants. The LSH algorithm captures malware similarity on the basis of image similarity. We implement the B2M (binary mapping to image) algorithm, the SURF algorithm, and the LSH algorithm in a complete malware variant detection system. The evaluation shows that our approach is highly effective in terms of response time and malware variant detection.
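The pipeline described in the abstract (bytes to image, image to fingerprint, fingerprint to LSH lookup) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration, not the authors' implementation: the abstract does not specify how B2M lays out bytes, so one byte per grayscale pixel at a fixed row width is assumed; a normalized intensity histogram stands in for SURF descriptors; and the LSH step uses random-hyperplane signatures compared by Hamming distance with a linear scan, rather than hash-table buckets. All function names are hypothetical.

```python
import random

def bytes_to_image(data: bytes, width: int = 16) -> list[list[int]]:
    """Assumed B2M layout: one byte per grayscale pixel, zero-padded rows."""
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]

def toy_fingerprint(image: list[list[int]], bins: int = 16) -> list[float]:
    """Stand-in for SURF (not SURF itself): normalized intensity histogram."""
    hist = [0] * bins
    count = 0
    for row in image:
        for pixel in row:
            hist[pixel * bins // 256] += 1
            count += 1
    return [h / max(count, 1) for h in hist]

def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> list[list[float]]:
    """Draw n_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vec: list[float], planes: list[list[float]]) -> tuple[int, ...]:
    """One bit per hyperplane: which side the mean-centered vector falls on."""
    mean = sum(vec) / len(vec)
    return tuple(
        int(sum(p * (v - mean) for p, v in zip(plane, vec)) >= 0)
        for plane in planes
    )

def hamming(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Number of differing signature bits."""
    return sum(x != y for x, y in zip(a, b))

def most_similar(query: bytes, corpus: dict[str, bytes],
                 planes: list[list[float]], k: int = 3) -> list[str]:
    """Rank corpus entries by Hamming distance between LSH signatures."""
    def sig(blob: bytes) -> tuple[int, ...]:
        return lsh_signature(toy_fingerprint(bytes_to_image(blob)), planes)
    q = sig(query)
    scored = sorted((hamming(q, sig(blob)), name) for name, blob in corpus.items())
    return [name for _, name in scored[:k]]
```

A caller would build the hyperplanes once, for example `planes = make_hyperplanes(dim=16, n_bits=32)`, and reuse them for every query so that signatures remain comparable. A production system would index signatures in hash buckets to avoid scanning the whole corpus.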
Keywords :
cryptography; feature extraction; fingerprint identification; image coding; image matching; invasive software; B2M; LSH; SURF algorithm; binary mapping to image algorithm; content fingerprint; distance metrics; fast fingerprint matching; feature extraction; feature fingerprint; image similarity; information system security; locality-sensitive hashing schemes; malware binary image; malware code corpus; malware code features; malware variant detection; similarity search; Algorithm design and analysis; Data visualization; Feature extraction; Fingerprint recognition; Force; Malware; Vectors; Content Fingerprint; Locality-sensitive Hashing; Malware Variant Detection; Similarity Search;
Conference_Title :
Control and Decision Conference (2014 CCDC), The 26th Chinese
Conference_Location :
Changsha
Print_ISBN :
978-1-4799-3707-3
DOI :
10.1109/CCDC.2014.6852216