Title :
A Novel Phishing Page Detection Mechanism Using HTML Source Code Comparison and Cosine Similarity
Author :
Roopak, S. ; Thomas, Tessamma
Author_Institution :
Sch. of Comput. Sci. & IT, Indian Inst. of Inf. Technol. & Manage., Thiruvananthapuram, India
Abstract :
Phishing is a social engineering technique used by hackers to steal information and sometimes money from online users. Phishing web sites imitate legitimate web sites. Our aim is to detect phishing pages and block them. In this paper, we propose a novel method for detecting phishing pages that finds similar web pages by mining the web and compares them by matching their HTML source code and computing the cosine similarity of their textual content. We then developed a browser capable of detecting phishing pages. The browser was tested with more than 20 phishing sites from Phishtank.com using different tag match percentages and cosine similarity values. The results indicate that the detection rate of the proposed mechanism is high compared to other existing methods.
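The two comparison signals named in the abstract, a tag match percentage over the HTML source and a cosine similarity over the textual content, might be sketched as below. This is an illustration under stated assumptions and not the authors' implementation: the abstract does not define how tags are matched, so a multiset overlap is assumed here, and the function names and both threshold values are hypothetical. The sketch only measures how alike two pages are; it does not perform the web-mining step or decide by itself that a page is phishing.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _PageParser(HTMLParser):
    """Collects the opening-tag sequence and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_parts.append(data)


def parse_page(html):
    """Return (list of tag names, visible text) for an HTML string."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    return parser.tags, " ".join(parser.text_parts)


def tag_match_percentage(tags_a, tags_b):
    """Assumed definition: shared tags (as a multiset) over the longer page."""
    shared = sum((Counter(tags_a) & Counter(tags_b)).values())
    longest = max(len(tags_a), len(tags_b))
    return 100.0 * shared / longest if longest else 0.0


def cosine_similarity(text_a, text_b):
    """Cosine similarity of the term-frequency vectors of two texts."""
    vec_a = Counter(re.findall(r"[a-z0-9]+", text_a.lower()))
    vec_b = Counter(re.findall(r"[a-z0-9]+", text_b.lower()))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pages_are_similar(html_a, html_b, tag_threshold=80.0, cosine_threshold=0.8):
    """Hypothetical decision rule; both thresholds are illustrative only."""
    tags_a, text_a = parse_page(html_a)
    tags_b, text_b = parse_page(html_b)
    return (tag_match_percentage(tags_a, tags_b) >= tag_threshold
            and cosine_similarity(text_a, text_b) >= cosine_threshold)
```

In a setup like the one the abstract describes, a suspect page would be compared against candidate legitimate pages found by searching the web, with the thresholds tuned experimentally.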
Keywords :
Web sites; computer crime; hypermedia markup languages; source code (software); HTML source code comparison; Phishtank.com; Web pages; cosine similarity; hackers; information stealing; phishing page detection mechanism; social engineering technique; Browsers; Electronic mail; Google; HTML; IP networks; social engineering; web mining
Conference_Title :
Advances in Computing and Communications (ICACC), 2014 Fourth International Conference on
Conference_Location :
Cochin
Print_ISBN :
978-1-4799-4364-7
DOI :
10.1109/ICACC.2014.47