A Novel Scoring Model to Detect Potential Malicious Web Pages

Author

Le, Van Lam ; Welch, Ian ; Gao, Xiaoying ; Komisarczuk, Peter

Author_Institution

Sch. of Eng. & Comput. Sci., Victoria Univ. of Wellington, Wellington, New Zealand

fYear

2012

fDate

25-27 June 2012

Firstpage

254

Lastpage

263

Abstract

Malicious web pages embed active content that exploits vulnerabilities in users' browsers and plug-ins in order to compromise the users' machines. Research approaches to identifying malicious web pages fall into two groups, depending on the type of web page features used: run-time features, based on observing what happens when the web page is loaded (slow but accurate), or static features, based on the content, structure, or properties of the web page (fast but inaccurate). Hybrid approaches combine the best of both to provide scalable systems with good accuracy by using the static-feature-based approach as a pre-filter for the run-time-feature-based approach. One of the critical challenges for such hybrid approaches is to build an effective pre-filter that can manage the trade-off between reducing the number of web pages passed through to the run-time feature detector and misidentifying malicious web pages as benign. This paper presents a novel scoring model to filter potentially malicious web pages by using static features from various sources of information about malicious web pages, finding suitable algorithms to score the maliciousness of each source of information, and finally finding the best ways to combine scores from different sources of information in order to achieve the best accuracy. The results show that our novel scoring model can combine knowledge from various sources of information about web pages very effectively in order to filter potentially malicious web pages.
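The pre-filter idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which scoring algorithms or combination method the paper uses, so everything below is an assumption made for illustration only: the weighted average, the names `SourceScore`, `combine_scores`, and `needs_runtime_check`, the source names, the weights, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SourceScore:
    """Maliciousness score from one source of static information (hypothetical)."""
    name: str      # e.g. "url", "html", "script" (illustrative source names)
    score: float   # assumed to lie in [0, 1]; higher means more suspicious
    weight: float  # assumed relative trust in this source


def combine_scores(scores):
    """Combine per-source scores with a weighted average.

    This is an assumed combination rule, not the one from the paper.
    """
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(s.score * s.weight for s in scores) / total_weight


def needs_runtime_check(scores, threshold=0.3):
    """Pre-filter decision: pass the page to the slow run-time detector
    only when the combined static score reaches the threshold."""
    return combine_scores(scores) >= threshold


page = [
    SourceScore("url", 0.9, 1.0),
    SourceScore("html", 0.5, 1.0),
    SourceScore("script", 0.1, 2.0),
]
print(needs_runtime_check(page, threshold=0.3))
```

In such a sketch, lowering the threshold sends more pages to the run-time detector (fewer malicious pages missed, more work), while raising it filters more aggressively, which is the trade-off the abstract describes.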

Keywords

Internet; security of data; potential malicious web pages detection; run-time feature based approach; scoring model; static feature based approach; Browsers; Computational modeling; Euclidean distance; Feature extraction; Training; Vectors; Web pages; Drive-by-download; Internet Security; malicious web page

fLanguage

English

Publisher

IEEE

Conference_Titel

Trust, Security and Privacy in Computing and Communications (TrustCom), 2012 IEEE 11th International Conference on

Conference_Location

Liverpool

Print_ISBN

978-1-4673-2172-3

Type

conf

DOI

10.1109/TrustCom.2012.44

Filename

6295983