Crawl-shing: A focused crawler for fetching phishing contents based on graph isomorphism

F Tchakounte, JCT Ngnintedem, I Damakoa, F Ahmadou, FAK Fotso
Journal of King Saud University-Computer and Information Sciences, 2022, Elsevier
Abstract
The Web is full of accounts of phishing experiences that share phishers' techniques and behaviours. We argue that this textual information can be turned into knowledge that can be exploited to prevent such attacks. Unlike anti-phishing work that aims to detect phishing traces, this work is the first attempt to design a tool that retrieves web pages with phishing content. The proposed crawler is dedicated to extracting phishing feeds. Existing crawlers mainly rely on building vector space models (VSM) from pages, using term frequency-inverse document frequency (TF-IDF) and cosine similarity to compute the similarity of Web pages to a given query. Because vector modelling ignores the order in which terms appear in a document, as well as the proximity and connections between terms, we introduce Crawl-shing, an improved search model based on isomorphic graphs which, given two documents, evaluates their degree of similarity by seeking the largest common subgraph. Experimental results on phishing Web pages show that Crawl-shing achieves a better harvest rate than the Breadth First Search (BFS) approach. Crawl-shing was also found to be more precise during exploration than approaches based on vector modelling.
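
The contrast the abstract draws between vector modelling and graph-based similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how Crawl-shing builds its graphs or scores them, so the graph construction (terms as nodes, directed edges between adjacent terms), the normalisation by the larger graph, and the function names `cosine_bow`, `term_graph` and `mcs_similarity` are all assumptions made for illustration. Plain term-frequency cosine stands in for TF-IDF, since only two documents are compared.

```python
import math
from collections import Counter


def tokens(text):
    # Naive whitespace tokenisation (assumption; real crawlers would
    # also strip markup, stop words, punctuation, and so on).
    return text.lower().split()


def cosine_bow(a, b):
    # Bag-of-words cosine similarity: term order plays no role.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def term_graph(text):
    # Hypothetical graph model: one node per distinct term, one directed
    # edge per pair of adjacent terms, so order and proximity are kept.
    toks = tokens(text)
    return set(toks), set(zip(toks, toks[1:]))


def mcs_similarity(a, b):
    # Nodes carry unique term labels, so the largest common subgraph
    # reduces to intersecting the node sets and the edge sets.
    # Its size is normalised by the size of the larger graph.
    nodes_a, edges_a = term_graph(a)
    nodes_b, edges_b = term_graph(b)
    common = len(nodes_a & nodes_b) + len(edges_a & edges_b)
    largest = max(len(nodes_a) + len(edges_a), len(nodes_b) + len(edges_b))
    return common / largest if largest else 0.0


doc1 = "attacker sends fake bank email"
doc2 = "fake email sends bank attacker"
print(cosine_bow(doc1, doc2), mcs_similarity(doc1, doc2))
```

The two sample documents contain the same words in a different order, so the bag-of-words score cannot tell them apart, while the graph score drops because no adjacency edge is shared. For graphs without unique node labels, finding the largest common subgraph is NP-hard in general, and the set-intersection shortcut used here no longer applies.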