作者
Abdullah Al-Alyan, Saad Al-Ahmadi
发表日期
2020/7/1
期刊
KSII Transactions on Internet & Information Systems
卷号
14
期号
7
简介
Phishing websites can have devastating effects on governmental, financial, and social services, as well as on individual privacy. Currently, many phishing detection solutions are evaluated using small datasets and, thus, are prone to sampling issues, such as representing legitimate websites by only high-ranking websites, which could make their evaluation less relevant in practice. Phishing detection solutions which depend only on the URL are attractive, as they can be used in limited systems, such as with firewalls. In this paper, we present a URL-only phishing detection solution based on a convolutional neural network (CNN) model. The proposed CNN takes the URL as the input, rather than using predetermined features such as URL length. For training and evaluation, we have collected over two million URLs in a massive URL phishing detection (MUPD) dataset. We split MUPD into training, validation and testing datasets. The proposed CNN achieves approximately 96% accuracy on the testing dataset; this accuracy is achieved with URL schemes (such as HTTP and HTTPS) removed from the URL. Our proposed solution achieved better accuracy compared to an existing state-of-the-art URL-only model on a published dataset. Finally, the results of our experiment suggest keeping the CNN up-to-date for better results in practice.
引用总数
20212022202320241881
学术搜索中的文章
A Al-Alyan, S Al-Ahmadi - KSII Transactions on Internet & Information Systems, 2020