Automated, reliable zero-day malware detection based on autoencoding architecture

C Kim, SY Chang, J Kim, D Lee, J Kim
IEEE Transactions on Network and Service Management, 2023 - ieeexplore.ieee.org
Although malware detection has been studied extensively, existing methods are often limited to known malware patterns because they rely on signature-based or supervised learning approaches. Semi-supervised learning is an option for identifying previously unseen patterns (i.e., zero-day detection); however, our preliminary study reveals critical limitations in existing methods: (i) the profiling-based approach using an autoencoder can provide better detection but is sensitive to the threshold setting, and (ii) one-class (OC) classification does not require manual threshold discovery but may suffer from low detection rates. In this paper, we present a new detection method that combines autoencoding and OC classification, designed to benefit from the strong abstraction of neural networks (using an autoencoder) while removing the complex threshold selection (using an OC classifier). A challenge for this combined architecture is the concurrent training of the autoencoder and the OC classifier, which may produce an ill-suited learner because training has no reference to malware instances. To address this, we introduce a new model selection method that discovers well-optimized models from a variety of combinations. Experiments with public malware datasets (Meraz’18 and Drebin) show the effectiveness of the presented methods, with up to 97.1% accuracy, comparable to supervised learning-based detection. We also examine the impact of evasion attacks using adversarial attack tools; the results show resilience to malware variants, with detection rates over 99%.
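To make the combined idea concrete, the following is a toy sketch only, not the authors' architecture, training procedure, or model selection method. It assumes synthetic 3-D data, a one-unit linear autoencoder with tied weights trained on benign samples only, and a simple distance-based one-class boundary standing in for an OC classifier. All names (make_samples, train_tied_autoencoder, OneClassBoundary) and all parameter values are hypothetical. The quantile used to fit the boundary plays a role loosely similar to the outlier-fraction parameter of a one-class SVM: it is derived from benign data alone, with no malware samples and no hand-tuned reconstruction-error threshold.

```python
import math
import random

# Assumption: benign samples lie near one axis; "malware-like" samples leave it.
DIRECTION = (0.6, 0.8, 0.0)   # hypothetical benign behaviour axis (unit length)
OFF_AXIS = (0.8, -0.6, 0.0)   # unit vector orthogonal to DIRECTION


def make_samples(rng, n, off_axis_range=(0.0, 0.0), noise=0.05):
    """Draw n synthetic 3-D samples; a nonzero off_axis_range mimics anomalies."""
    samples = []
    for _ in range(n):
        t = rng.uniform(-1.0, 1.0)
        u = rng.uniform(*off_axis_range) * rng.choice((-1.0, 1.0))
        samples.append([t * d + u * e + rng.gauss(0.0, noise)
                        for d, e in zip(DIRECTION, OFF_AXIS)])
    return samples


def train_tied_autoencoder(samples, epochs=50, lr=0.01, seed=0):
    """One-unit linear autoencoder with tied weights: z = w.x, x_hat = z * w."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(len(samples[0]))]
    for _ in range(epochs):
        for x in samples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            r = [xi - z * wi for xi, wi in zip(x, w)]      # reconstruction residual
            rw = sum(ri * wi for ri, wi in zip(r, w))
            # Gradient descent step on the squared reconstruction error ||r||^2.
            w = [wi + 2.0 * lr * (rw * xi + z * ri)
                 for wi, xi, ri in zip(w, x, r)]
    return w


def encode(w, x):
    """Features passed to the one-class stage: (latent code, reconstruction error)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    err = math.sqrt(sum((xi - z * wi) ** 2 for xi, wi in zip(x, w)))
    return (z, err)


class OneClassBoundary:
    """Crude one-class stand-in: standardized distance from the benign centre."""

    def fit(self, feats, quantile=0.95):
        n = len(feats)
        dims = range(len(feats[0]))
        self.mean = [sum(f[j] for f in feats) / n for j in dims]
        self.std = [math.sqrt(sum((f[j] - self.mean[j]) ** 2 for f in feats) / n) or 1.0
                    for j in dims]
        dists = sorted(self._dist(f) for f in feats)
        self.radius = dists[min(int(quantile * n), n - 1)]  # fitted on benign data only
        return self

    def _dist(self, f):
        return math.sqrt(sum(((v - m) / s) ** 2
                             for v, m, s in zip(f, self.mean, self.std)))

    def is_anomaly(self, f):
        return self._dist(f) > self.radius


rng = random.Random(42)
benign_train = make_samples(rng, 300)
benign_test = make_samples(rng, 200)
malware_like = make_samples(rng, 200, off_axis_range=(0.5, 1.0))

w = train_tied_autoencoder(benign_train)
boundary = OneClassBoundary().fit([encode(w, x) for x in benign_train])

false_alarm_rate = sum(boundary.is_anomaly(encode(w, x)) for x in benign_test) / len(benign_test)
detection_rate = sum(boundary.is_anomaly(encode(w, x)) for x in malware_like) / len(malware_like)
print(f"false alarms on benign: {false_alarm_rate:.2f}, detected anomalies: {detection_rate:.2f}")
```

In this sketch the two stages are trained one after the other, which sidesteps the concurrent-training challenge the abstract describes; a faithful implementation would need the paper's own training and model selection procedure.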