A lightweight double-stage scheme to identify malicious DNS over HTTPS traffic using a hybrid learning approach

Q Abu Al-Haija, M Alohaly, A Odeh - Sensors, 2023 - mdpi.com
The Domain Name System (DNS) protocol translates domain names to IP addresses, enabling browsers to load and use Internet resources. Despite its major role, DNS is vulnerable to various security loopholes that attackers have continually abused. Delivering secure DNS traffic has therefore become challenging, since attackers use advanced and fast malicious information-stealing approaches. To overcome DNS vulnerabilities, the DNS over HTTPS (DoH) protocol was introduced; it improves the security of DNS by encrypting the DNS traffic and communicating it over a covert network channel. This paper proposes a lightweight, double-stage scheme to identify malicious DoH traffic using a hybrid learning approach. The system comprises two layers. At the first layer, the traffic is examined using random fine trees (RF) and identified as DoH traffic or non-DoH traffic. At the second layer, the DoH traffic is further investigated using AdaBoost trees (ADT) and identified as benign DoH or malicious DoH. The proposed system is lightweight because it works with the fewest features (only six out of thirty-three), selected using principal component analysis (PCA), and reduces the number of samples using a random under-sampling (RUS) approach. The experimental evaluation reported a high-performance system with a predictive accuracy of 99.4% and 100% and a predictive overhead of 0.83 µs and 2.27 µs for layer one and layer two, respectively. The reported results surpass existing models, even though the proposed model uses only 18% of the feature set and 17% of the sample set, distributed in balanced classes.
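The two-layer flow and the under-sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two decision functions are hypothetical threshold rules standing in for the trained RF and ADT models, and the flow feature names (dst_port, mean_packet_len, packets_per_s) are assumptions made up for the example. Only the control flow (layer two runs solely on traffic that layer one labels as DoH), the class balancing, and the 6-of-33 feature arithmetic come from the abstract.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Random under-sampling (RUS): keep only as many randomly chosen
    samples per class as the smallest class contains."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        for sample in rng.sample(group, n_min):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def two_stage_classify(flow, is_doh, is_malicious):
    """Layer 1 decides DoH vs. non-DoH; layer 2 runs only on DoH flows."""
    if not is_doh(flow):
        return "non-DoH"
    return "malicious DoH" if is_malicious(flow) else "benign DoH"


# Hypothetical stand-ins for the two trained models. These threshold
# rules are assumptions for illustration, NOT the paper's classifiers.
def toy_is_doh(flow):
    return flow["dst_port"] == 443 and flow["mean_packet_len"] < 200


def toy_is_malicious(flow):
    return flow["packets_per_s"] > 50


# Made-up flows with assumed feature names.
flows = [
    {"dst_port": 80, "mean_packet_len": 500, "packets_per_s": 5},
    {"dst_port": 443, "mean_packet_len": 120, "packets_per_s": 3},
    {"dst_port": 443, "mean_packet_len": 110, "packets_per_s": 90},
]
results = [two_stage_classify(f, toy_is_doh, toy_is_malicious) for f in flows]

# Share of the feature set used: six of thirty-three features, in percent.
feature_share = round(6 / 33 * 100)
```

In a real pipeline the two stand-in functions would be replaced by the predict calls of fitted models, and random_undersample would be applied to the training set before fitting each layer.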