1. Introduction
Recent developments in communications and information technologies, such as the Internet of Things (IoT), have extended far beyond the traditional sensing of nearby environments. IoT technologies have facilitated the development of systems that can improve quality of life. IoT is one of the fastest-growing technologies in computing, with an estimated 50 billion devices by the end of 2020 [
1]. It has been estimated that, by the year 2025, the IoT and related applications will have a potential economic impact of $3.9 trillion to $11.1 trillion per year [
2]. IoT devices can become smart objects by taking advantage of core technologies such as communication technologies, pervasive and ubiquitous computing, embedded devices, Internet protocols, sensor networks, and Artificial Intelligence (AI)-based applications [
3].
The ubiquitous interconnection of physically distributed IoT devices extends the computation and communication to other IoT devices with different specifications [
4]. Multiple types of sensors embedded in these devices enable real-time data to be gathered remotely from the physical devices. The collected data allows us to build intelligent decision systems and to manage IoT environments effectively. However, connecting commonly used real-world devices to the Internet also raises concerns about cybersecurity threats [
5,
6]. Therefore, there is a need to design and develop intelligent security solutions that protect IoT devices and defend against attacks generated from compromised IoT devices.
1.1. Motivation
While IoT technologies play a vital part in improving real-life smart systems, such as smart cities, smart homes and smart healthcare, the large scale and ubiquitous nature of IoT systems have introduced new security challenges [
5,
6,
7]. Furthermore, since IoT devices generally work in an unattended environment, an attacker may physically access these devices with malicious intent [
8,
9]. Also, because IoT devices are usually connected over wireless networks, eavesdropping can be used to access private information from a communication channel [
10,
11]. On top of these security challenges, IoT devices cannot afford the implementation of advanced security features because of their restricted energy and computation resources. Due to the interconnected and interdependent settings of the IoT, new attack surfaces are emerging very regularly [
12,
13]. Thus, IoT systems are more vulnerable than traditional computing systems. This necessitates research into detection and prevention techniques specific to IoT systems, to protect against threats based on IoT devices.
For protecting IoT systems against cyber threats, another line of defense should be developed in IoT networks. Intrusion Detection Systems (IDSs) fulfill this purpose [
14,
15]. Various surveys have attempted to describe machine learning-based IDSs for protecting IoT networks or guarding against compromised IoT devices. The surveys cover research work on IDSs for cloud-based IoT systems [
16], wireless sensor networks [
17,
18,
19], cyber-physical systems [
20], and mobile ad hoc networks (MANETs) [
21,
22,
23]. However, traditional IDS methods are less effective or insufficient for the security of IoT systems because of the peculiar characteristics of such systems mentioned above, in particular limited energy, ubiquity, heterogeneity, limited bandwidth capacity and global connectivity. Machine Learning (ML) and Deep Learning (DL) based techniques have recently gained credibility through their successful application to the detection of network attacks, including attacks on IoT networks. This is because ML/DL based methods can capture benign and anomalous behavior in IoT environments. Traffic from IoT devices and networks can be captured and investigated to learn normal patterns. Any deviation from these learned normal patterns can be used to detect anomalous behavior. Furthermore, ML/DL based methods have been tested for predicting new or zero-day attacks. Hence, ML/DL based algorithms provide a robust basis for designing the security of IoT devices and networks.
Various surveys have discussed different techniques for designing IDS for IoT systems, but most of the aforementioned surveys did not address the implementation of ML or DL techniques as detection mechanisms in IoT networks and their lightweight devices in a comprehensive manner. Some of these studies published in [
24,
25,
26,
27,
28,
29] focused on IoT security issues in general and on their classification into layers related to applications, network, encryption and authentication, and access controls. ML and DL based techniques for IDSs in IoT networks therefore still need systematic analysis and investigation, which is a major focus of this study.
1.2. Scope of This Survey
This survey includes six important areas related to IDSs for IoT systems and networks: (1) IoT architectures and technologies; (2) IoT threats and attack types; (3) IDS architectures and their design; (4) an explanation of ML and DL techniques applied in the design of IDSs; (5) a description of various datasets available to researchers for evaluation of their proposed IDS; and (6) future research challenges and directions.
1.3. Main Contribution
In this paper, a detailed review of network threats from IoT networks and their devices with corresponding ML and DL based attack detection techniques is presented.
Table 1 summarizes a comparison of our survey with the other surveys conducted on IDSs in IoT networks. As described in the table, this survey covers all important aspects on the subject of ML and DL based techniques used for IDS in IoT networks and their systems. The table also shows that other surveys partially cover some of the aspects and there is no single paper that explains all the aspects. The key contributions of this survey are described as follows:
Discussion of IoT architectures and IoT Protocols, covering their technologies, frequency bands, and data rates.
Explanation of vulnerabilities, threat dimensions and attack surfaces of IoT systems, including a detailed discussion of attack types related to IoT protocols.
Review of ML- and DL-based IDSs, covering in detail their design choices, pros, cons and detection methods.
Discussion of the datasets available for network and IoT security-related research, detailing the advantages and limitations of each.
Explanation of the applications of ML and DL techniques for developing IDSs in IoT networks and their systems.
Presentation of the current research challenges and their future directions for research in this field.
The paper is organized as follows. In Section 2, recent studies related to anomaly and intrusion detection in IoT networks are discussed. In Section 3, an overview of IoT systems is presented, covering IoT architectures, reference models and IoT protocols. Section 4 describes various attacks and threats against IoT systems. Following this, Section 5 discusses IDS architecture, its design choices and various detection methods, with the associated ML and DL techniques described in Section 6 and Section 7, respectively. Section 8 briefly describes the datasets that are available and used for testing IDSs. Finally, future challenges and the paper’s conclusion are provided in Section 9 and Section 10, respectively.
2. Current Reviews
Various survey studies have been carried out in the field of IoT security by describing vulnerabilities in IoT systems. However, most of the existing studies on IoT security have not mainly focused on the applications of ML/DL techniques for IoT security.
Table 1 summarizes a comparison of our survey with the other surveys conducted on IDSs in IoT networks. The comparison discusses the contributions of each survey related to the design of IoT-based IDSs.
In [
32], the authors studied the challenges of IoT security at the communication layer. A study in [
33] focused on reviewing IDSs for IoT networks. The work in [
34] briefly discussed the relevance of ML techniques in the context of IoT security and privacy. Moreover, the authors identified limited bandwidth, limited computation power and a lack of adequate storage as bottlenecks in any implementation of ML-based security solutions for IoT networks. There are other studies [
35,
36], which discussed the feasibility of both ML and data mining techniques for detecting intrusions in IoT networks by implementing these techniques in IDSs through either anomaly detection or traffic classification. In [
21], the authors highlighted differences between IDSs running over wired networks and those running over wireless infrastructure, especially IoT networks. Due to fundamental architectural variations, the application of ML techniques in IoT IDSs needs specific treatment related to the type of attacks, the underlying protocols (both in communications and networks), and the application layer.
Another study published in [
22] discussed the implementation of IDSs in the context of MANETs. The authors described three different types of IDS architectures that are feasible in MANETs. The first is a layered architecture organized in multiple hierarchical layers. The second is a flat architecture for deployment in a distributed and cooperative environment. The third is a hybrid of both that uses mobile agents. Another study [
23] discussed various intrusion detection algorithms related to IDS implementation in MANETs. According to the authors, these IDS algorithms can be grouped into categories based on the underlying principle used to detect an attack. These principles can be a rule, statistics, heuristics, signature, state, reputation score, or the route used. These techniques were later classified further as anomaly detection, misuse, signature-based, or hybrid techniques. Other classification criteria were also proposed by the authors [
23], such as real-time/offline operation, attack types and effectiveness of detection (scalability, reliability, timeliness, etc.).
In another survey [
30], the authors explained a classification of IDSs for Wireless Sensor Networks (WSNs) based on the deployment model of the IDS agent. The deployment model can be distributed, central, or hybrid, the last of which is suggested as the best-suited model for WSNs. A similar study [
31] classified IDSs for WSNs according to the type of detection used by the IDS. The classes identified included anomaly detection, misuse detection and specification-based detection. Another aspect, the cloud-based IoT environment, was discussed in [
16], where the authors studied and classified various cloud-based IDSs affecting Confidentiality, Integrity, and Availability (CIA) of cloud computing-based IoT networks. They explained Hypervisor-based IDS, Host-based IDS (HIDS), Network-based IDS (NIDS) and Distributed IDS. In [
30], the authors presented a survey on IoT IDSs with a focus on IDS architecture. The survey covered existing IoT protocols, standards and technologies, IoT security threats and detection types, and concluded by proposing an IoT IDS architecture.
The authors in [
39] proposed a novel multi-stage anomaly detection technique based on Boruta Firefly Aided Partitioning Density-Based Spatial Clustering of Applications with Noise (BFA-PDBSCAN). The authors claimed that their proposed technique produced better results in comparison to the related techniques of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). In [
40], the authors proposed a hybrid data processing model for network anomaly detection that utilizes Grey Wolf Optimization (GWO) and Convolutional Neural Network (CNN) techniques. The authors stated that their model achieved better accuracy and detection rate in comparison to the other state-of-the-art IDSs. In [
41], an anomaly detection method based on a deep autoencoder was used to detect attacks of IoT botnets. The method comprises extracting statistical features from behavioral snapshots of normal IoT device traffic sequences and training of a DL based autoencoder on the extracted features. The reconstruction error for traffic observations is then compared with a threshold to classify them as normal or anomalous. The authors evaluated the proposed detection method on the BASHLITE and Mirai botnets dataset generated using commercial IoT devices. In a recent survey paper published in [
37], the authors presented an overview of learning-based NIDSs for IoT systems.
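The thresholding step of the autoencoder-based method described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-element feature vectors, the `mean + k * std` threshold rule and the `reconstruct` stand-in (which simply returns the benign mean instead of running a trained deep autoencoder) are all hypothetical and are not taken from the cited work.

```python
from statistics import mean, pstdev

def reconstruction_error(x, x_hat):
    """Mean squared error between a feature vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def fit_threshold(benign, reconstruct, k=3.0):
    """Assumed rule: threshold = mean + k * std of the errors on benign traffic."""
    errors = [reconstruction_error(x, reconstruct(x)) for x in benign]
    return mean(errors) + k * pstdev(errors)

def classify(x, reconstruct, threshold):
    """Label an observation by comparing its reconstruction error to the threshold."""
    error = reconstruction_error(x, reconstruct(x))
    return "anomalous" if error > threshold else "normal"

# Hypothetical statistical features extracted from benign device traffic.
benign = [[10.0, 0.5], [11.0, 0.4], [9.0, 0.6], [10.0, 0.5]]

# Stand-in for a trained autoencoder: it always reconstructs the benign mean.
centroid = [mean(column) for column in zip(*benign)]
def reconstruct(x):
    return centroid

threshold = fit_threshold(benign, reconstruct)
print(classify([10.2, 0.5], reconstruct, threshold))
print(classify([500.0, 9.0], reconstruct, threshold))
```

In a real deployment the stand-in would be replaced by the trained model, and the threshold would be tuned on held-out benign traffic.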
4. IoT-Based Threats and Attacks
IoT systems face greater security risks than conventional computing systems for several reasons [
15,
47]. First, IoT systems are highly diverse with regard to devices, platforms, communication means and protocols. Second, IoT systems comprise “things” not planned to be connected to the Internet, where control devices are used to link physical systems. Third, there are no well-defined boundaries in IoT systems, which regularly change due to the mobility of users and devices. Fourth, IoT systems, or parts of them, may be physically insecure. Last but not least, due to the limited energy of IoT devices, it is usually very hard to deploy advanced security techniques and tools on IoT devices.
An IoT network often contains hundreds of nodes with assigned functions ranging from sensing light, temperature and noise to controlling lighting and heating, ventilation, and air conditioning (HVAC) systems. All these sensors and control systems communicate through different network protocols such as Bluetooth, WiFi and ZigBee. An IoT gateway is used to connect these devices to the Internet. Being composed of layers of standards, services and technologies, the IoT environment has privacy and security concerns at each of these layers. While it seems that the IoT environment has similar security concerns to the Internet, cloud and mobile communication networks, there are distinct characteristics that set IoT environments apart and affect the application of contemporary security controls [
10]. These include data sharing, limited computing capacity and a large number of networked IoT devices.
One instance of the susceptibility of IoT devices to attacks was demonstrated in September 2016, where an IoT botnet built from the Mirai malware—possibly the largest botnet on record—was responsible for a 620 Gbps attack directed towards Brian Krebs’s security blog [
11]. Mirai followed a simple strategy: it tried a list of 62 common user credentials to gain access to digital video recorders, home routers and network-enabled cameras, which generally had fewer defenses than other IoT devices. Later in the same month, the French web host OVH (On Vous Héberge) was hit by a Mirai-based attack, which broke the record for the largest recorded distributed denial of service (DDoS) attack, peaking at 1.1 Tbps [
12]. The attack was made possible due to default and weak security configurations. Similarly, in [
49], the authors described the relative ease of compromising various IoT devices, due to flaws in protocol implementations.
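Because Mirai relied on a short list of common credentials, a simple defensive check is to audit a device inventory for default logins. The sketch below is a minimal illustration: the credential pairs, device names and inventory format are hypothetical placeholders, not the list used by Mirai.

```python
# Hypothetical placeholder pairs; a real audit would load a maintained list.
DEFAULT_CREDENTIALS = {("admin", "admin"), ("root", "root"), ("user", "user")}

def find_default_credentials(devices):
    """Return the names of devices whose login is a known default pair."""
    return [
        device["name"]
        for device in devices
        if (device["username"], device["password"]) in DEFAULT_CREDENTIALS
    ]

# Hypothetical inventory of deployed devices.
inventory = [
    {"name": "camera-1", "username": "admin", "password": "admin"},
    {"name": "router-1", "username": "admin", "password": "k7#Qz!92xL"},
    {"name": "dvr-1", "username": "root", "password": "root"},
]

flagged = find_default_credentials(inventory)
print(flagged)
```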
The rapid proliferation of IoT based devices is likely to make such networks susceptible to attacks against privacy and security aspects. In [
13], the authors identified various security issues in IoT networks built with commercially available IoT devices such as sensors. One example is a smart watering system capable of measuring environmental variables such as temperature and humidity. An actuator module was employed to implement its functionality, with a web-based user interface. The system was built on an Arduino Uno. The authors described the exposure of such a network to spoofing attacks through a software-enabled access point (SoftAP), where an attacker caused all IoT devices in the network to go offline for a while by having the SoftAP broadcast de-authentication packets.
Due to the limited processing capabilities of IoT devices, the attacker caused all vulnerable IoT devices in the network to connect to the SoftAP, as it appeared to have a stronger signal than the actual access point (AP) with the same service set identifier (SSID). This exposed all network communications to eavesdropping and man-in-the-middle (MiTM) attacks. Such attack scenarios build a case for the deployment of IDSs in IoT networks to discover vulnerabilities of IoT devices. The idea of IoT revolves around the intelligent integration of a real physical environment with the Internet to enable interactivity. For this reason, IoT environments have interconnections and dependencies with multiple heterogeneous environments. This exposes each IoT system to cyber threats from each connected environment [
50,
51]. IoT environments face threats along multiple dimensions, from both physical and virtual domains.
Figure 7 illustrates multiple threat dimensions of an IoT environment that could be exploited.
Though IoT Security threats can be broadly divided into cyber and physical domains, our survey is mainly concerned with cyber threats, which can take the form of either active or passive attacks. Passive Attacks are characterized by a lack of any alteration to information or its flow, thereby only compromising the confidentiality and privacy of communications. In some cases, a passive attack can enable location tracking of IoT devices [
52,
53,
54]. Active Attacks involve the alteration and modification of information and its flow, including but not limited to device settings, control messages and software components.
One active attack is the use of IoT systems as a vector to launch massive DDoS attacks against Internet systems. IoT systems are a suitable vector for these attacks because of their large numbers and the comparative ease of compromising them, due to poor security practices and weak defense mechanisms. Mirai is an example of a botnet attack launched through compromised IoT systems [
11,
55,
56]. IoT systems face threats from multiple directions, including the user interface, cloud services, and other interconnected IoT systems associated with sensors and network services [
12], as shown in
Figure 7. A discussion of these dimensions is presented in the following subsections.
4.1. User Interface
Most use cases of IoT systems involve the provision of services to users through some sort of user interface (mobile, desktop or web application). For example, smart home appliances can be controlled by users through mobile applications. The rapid proliferation of smartphones has given malicious actors the opportunity to disguise malicious applications and malware as benign utility applications and to publish them through application stores without being detected [
57,
58]. Also, smartphones can sometimes be hacked through platform vulnerabilities, such as Android vulnerabilities. This exposes all information stored on the phone and opens the possibility of malware compromise. Eavesdropping, location tracking, Denial of Service (DoS)/DDoS, bluejacking and bluesnarfing are attacks enabled through user interface platforms [
59,
60,
61].
4.2. Cloud Services
Though Cloud services and IoT systems lie at two ends of the resource availability spectrum, the two can complement each other to produce an excellent blend of technologies. Cloud services are characterized by ubiquitous access to computing power and storage, etc., which can offset the resource limitations of IoT systems [
62]. The potential of IoT systems can be maximized through integrated use with cloud services to conserve energy and provide all types of services without being constrained by storage and processing power limitations [
63]. Likewise, cloud services can benefit from large deployments of IoT systems through integrated applications [
64]. Such a distributed architecture opens up vulnerable points for many attacks at multiple layers, as explained below.
Authorization Attacks. Through the exploitation of vulnerabilities in data security mechanisms, an attacker may be able to gain unauthorized access to information on both cloud and IoT systems.
Integrity Attacks. Such attacks enable an attacker to compromise the integrity of data through spoofing and bypass the authorization controls to gain direct access to databases.
Compromise of Virtualization Platform. A vulnerability in the virtualization platform can be exploited by an attacker to bypass security and isolation controls between the host and the guest operating system (OS), resulting in privilege escalation and pivoting attacks [
65].
Confidentiality Attacks. IoT systems, like wearable devices, are used to monitor health-related data of highly confidential nature. Similarly, smart home devices capture sensitive private data of the users. Privacy and confidentiality concerns overshadow the advantages of cloud services. Moreover, multi-tenancy and geographical location of cloud services pose a serious threat to the confidentiality of data through privilege escalation and hacking [
66].
4.3. Connections of Multiple IoT Systems
Various IoT systems are designed to work autonomously and interact with other IoT systems, such as sensors and actuators of smart cars and smart homes, without requiring human involvement. Such an interaction is aimed at achieving an autonomous and collaborative functionality. Smart cars and smart homes can communicate with each other and provide interdependent services and functions. For instance, [
67] described a scenario in which, when a temperature sensor detects an increased temperature and a smart plug is detected as unplugged, the windows of the room are opened automatically. An attacker could therefore reach the window-opening actuator by manipulating the temperature sensing device through its interface, which in turn compromises the actuator [
67]. This example highlights the fact that the weakest part of interdependent IoT systems can compromise other parts as well.
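The dependency in this scenario can be made concrete with a small sketch of the automation rule. The class, field names and the 35 °C trigger value are hypothetical and serve only to show that the actuator trusts whatever the sensor reports.

```python
from dataclasses import dataclass

@dataclass
class HomeState:
    temperature_c: float      # value reported by the temperature sensor
    plug_connected: bool      # state reported by the smart plug
    window_open: bool = False

def apply_rule(state, threshold_c=35.0):
    """Open the window when the room is hot and the smart plug is unplugged."""
    if state.temperature_c > threshold_c and not state.plug_connected:
        state.window_open = True
    return state

# Genuine reading: the rule does not fire.
honest = apply_rule(HomeState(temperature_c=22.0, plug_connected=False))

# Manipulated reading: the attacker never touches the window actuator,
# yet the forged sensor value is enough to open the window.
spoofed = apply_rule(HomeState(temperature_c=60.0, plug_connected=False))

print(honest.window_open, spoofed.window_open)
```

The rule itself is correct; the weakness lies in accepting an unauthenticated sensor value as input.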
A large number of interconnected devices in IoT systems increases the vulnerability and also the impact of any attack, where one compromised device can lead to the compromise of billions of devices. Such a scenario can impact any externally connected networks and systems also. One study [
68] demonstrated that an experimental malware attack against Philips Hue smart lamps was so successful that it compromised all such lamps in the network, despite the presence of reliable cryptographic authentication mechanisms against malicious firmware updates. Similar attacks could provide control of the lights of an entire city or enable their use in DDoS attacks against outside targets [
68].
Various types of sensors, such as GPS, Radio-Frequency Identification (RFID), temperature gauges and IP cameras, are an essential part of IoT systems. This also includes sensors and actuators embedded in autonomous vehicles and the Internet of Vehicles (IoV). These physical devices are vulnerable to physical attacks and manipulation by malicious actors. Another component of IoT systems susceptible to such physical attacks is the actuator, which performs some function based on the readings of sensor devices. Both actuators and sensors can be subjected to DoS attacks through flooding, as well as to eavesdropping, location tracking, cloning and spoofing attacks [
69,
70,
71].
An IoT system consists of several interconnected devices using either wireless or wired networks. A large network of linked devices can have a weak security profile, in which sensors and actuators are vulnerable to a multitude of attacks. WSNs provide information to external entities without any restriction. When they are integrated with conventional network services, they weaken the security of conventional networks [
72,
73].
4.4. Protocols Level Attacks
IoT systems differ from traditional Internet systems in that they require lightweight protocols to address issues of limited energy, data rate and computing power. A detailed description of attacks based on IoT protocols can be found in [
74]. Attacks on IoT technologies, together with their threat types, are presented in
Table 3.
4.5. Radio-Frequency Identification (RFID)
Because the communication between the reader and RFID tags takes place over an unprotected wireless channel, the transmitted data is exposed to unauthorized readers. RFID systems face security threats that differ from those encountered by traditional wireless systems [
75]. Various hacking techniques against RFID are discussed as follows:
Tag Disable. An attacker may remove the tag, delete the tag memory by sending a kill command, remove the antenna, apply a high-energy wave to the tag, or use a Faraday cage to block electromagnetic waves.
Tag Modification. An attacker modifies or deletes valuable data from the memory of the tag.
Cloning Tags. An attacker imitates or clones the tags after skimming the tag’s information.
Reverse Engineering. Using reverse engineering, an attacker can make a copy of a tag, and using tag examination, the attacker may get confidential data stored within a tag.
Eavesdropping. RFID systems working in ultra high frequency (UHF) are more vulnerable to this threat. An attacker gathers the information shared between a valid tag and valid reader.
Snooping. An attacker introduces an unauthorized reader to interact with the tag.
Skimming. An attacker snoops data shared between a legitimate reader and legitimate tag.
Replay Attack. An attacker spies to collect information about the IoT device or node and later replays the eavesdropped information to achieve deception.
Relay Attacks. An attacker places an illegitimate device between the tag and the reader to intercept, modify and forward information directly to other systems.
Electromagnetic (EM) Interference. An attacker creates a signal in the same range as the reader to preclude tags from communicating with readers.
Fake RFID Tag Query. An attacker sends queries and gets the same response from a tag at various locations to determine the location of a specific tag.
Cryptograph Decipher Attack. An attacker breaks encryption algorithms by launching brute-force attacks and obtains the plain text by deciphering the intercepted ciphertext.
Blocker tag Attack. Using a blocker tag, an attacker attempts to restrict the reader from reading tags.
4.6. Zigbee Protocol
The Zigbee protocol is one of the most popular IoT protocols used for communication in IoT devices because of its low cost, low power consumption and scalability. While the importance of security was considered during the design of Zigbee, some trade-offs were made to bring the cost of devices down and make them scalable at a low cost. Some of the standard security measures could not be implemented, which ultimately resulted in security vulnerabilities. The major security threats against Zigbee networks are enumerated below.
4.7. Wireless Fidelity (WiFi)
A detailed review of attacks against various versions of the 802.11 security mechanism (i.e., WPA, WPA2, WEP) is explained in [
80]. The most common WiFi attacks are described below.
Attacks Related to Retrieving Key. An attacker monitors specific packets and then cracks the key offline. The common attacks in this category are the Pyshkin, Tews, and Weinmann (PTW) attack, the Fluhrer, Mantin, and Shamir (FMS) attack, the KoreK family of attacks, the dictionary attack and address resolution protocol (ARP) injection [
80].
Attacks Related to Retrieving Keystream. An attacker is only required to monitor specific packets and then perform the key cracking process offline. The common attacks in this category are the PTW attack, the FMS attack, the KoreK family of attacks, the dictionary attack and ARP injection [
80].
DoS or Availability Attacks. This category includes attacks that make some service or network unavailable, commonly called DoS attacks. These attacks usually target either a specific user or device, or try to exhaust network resources (e.g., the network router or access point), disrupting service for all users in that network. These attacks mostly depend on the broadcast of forged 802.11 management messages, which are easy to launch in versions of the WiFi standards up to 802.11n, as the management messages are transmitted unguarded [
81]. Attacks in this category include: Disassociation Attack, Block ACK flood, Authentication Request Flooding Attack, Deauthentication Broadcast Attack, Fake Power Saving Attack, Beacon Flooding Attack, Probe Request and Response Flooding Attacks. A survey of DoS attacks in 802.11 is covered in [
82].
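Several of the flooding attacks listed above, such as the deauthentication broadcast attack, show up as an abnormal rate of one type of management frame. A minimal sketch of a rate-based detector over frame timestamps is given below. The one-second window and the limit of 10 frames are hypothetical values, and capturing the frames themselves is outside the scope of the example.

```python
from collections import deque

def detect_deauth_flood(timestamps, window_s=1.0, max_frames=10):
    """Return the first time at which more than max_frames deauthentication
    frames fall inside a sliding window of window_s seconds, else None."""
    recent = deque()
    for t in sorted(timestamps):
        recent.append(t)
        # Drop frames that have fallen out of the sliding window.
        while t - recent[0] > window_s:
            recent.popleft()
        if len(recent) > max_frames:
            return t
    return None

# Occasional deauthentication frames, as seen in normal operation.
normal = [0.0, 5.0, 12.0, 30.0]
# Fifty forged frames sent 10 ms apart.
flood = [i * 0.01 for i in range(50)]

print(detect_deauth_flood(normal))
print(detect_deauth_flood(flood) is not None)
```

A fixed threshold is the simplest possible choice; the ML-based detectors reviewed in this survey would instead learn the normal frame rate from traffic.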
4.8. Bluetooth
Most of the issues found in Bluetooth are related to the pairing process. Attacks can be launched at different stages of the pairing process, both before pairing is completed and after the devices have been paired [
83]. For instance, based on information collected after pairing, attackers can launch man-in-the-middle attacks. A review of Bluetooth security issues is explained in [
83,
84,
85]. The common attacks against Bluetooth are discussed below.
PIN Cracking Attack. This type of attack is performed during the pairing of the device and the process of authentication. An attacker collects the random number (RAND) and the Bluetooth Device Address (BD_ADDR) of the targeted device using a frequency sniffer tool. Then, a brute-force search (using, for example, the E22 algorithm) is applied to check all possible combinations of the PIN against the data collected earlier until the correct PIN is determined [
84].
MAC Spoofing Attack. This attack is launched during the process of link key generation and before encryption is established. Devices authenticate each other using the generated link keys. In this attack, attackers can impersonate another user. They can also terminate connections or even alter data [
84].
Man-in-the-Middle (MIM) Attack. MIM attacks are launched when devices are trying to pair [
86]. After the attack is launched, devices share messages unknowingly [
58]. During this time authentication is performed without the shared secret keys [
58]. When the attack is successful, the two devices are paired to the attacker [
57,
58], while they believe the pairing was successful.
Bluebugging. An attacker exploits vulnerabilities in the firmware of old devices to spy on phone calls, send and receive messages, and connect to the Internet without the legitimate users’ knowledge.
Bluesnarfing. An attacker gets unauthorized access to devices to retrieve information and redirect the incoming calls.
BluePrinting Attack. This attack is launched to capture the device model, manufacturer, and firmware version of the device. This attack will work only if the target device’s BD_ADDR is known.
Fuzzing Attack. In a fuzzing attack, an attacker forces a device to behave abnormally by sending malformed data packets to the Bluetooth radio of the device.
Brute-Force BD_ADDR Attack. Since the first three bytes of BD_ADDR are fixed and publicly known, a brute-force attack only needs to scan the last three bytes [
84].
Worm Attacks. In this attack, an attacker sends malicious software or a Trojan file to available vulnerable Bluetooth devices. Examples of these attacks are the Skulls worm, the Cabir worm and the Lasco worm.
DoS attacks. These attacks target the physical layer or above layers in the protocol stack. Some typical DoS attacks are battery exhaustion, BlueChop, BD_ADDR duplication, BlueSmack, Big NAK (Negative Acknowledgement) and L2CAP guaranteed service.
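The effect of the fixed prefix in the brute-force BD_ADDR attack above can be checked with a short calculation. A BD_ADDR is 48 bits (6 bytes) long, so knowing the first three bytes leaves 24 unknown bits. The four-digit numeric PIN in the last lines is an illustrative assumption, included to show why exhaustive PIN search is also feasible.

```python
BD_ADDR_BYTES = 6        # a BD_ADDR is 48 bits long
KNOWN_PREFIX_BYTES = 3   # the fixed, publicly known first three bytes

unknown_bits = 8 * (BD_ADDR_BYTES - KNOWN_PREFIX_BYTES)
full_space = 2 ** (8 * BD_ADDR_BYTES)   # all possible 48-bit addresses
reduced_space = 2 ** unknown_bits       # addresses left to scan

def pin_space(digits):
    """Number of candidate PINs made of the given number of decimal digits."""
    return 10 ** digits

print(reduced_space)                # 16777216
print(full_space // reduced_space)  # 16777216
print(pin_space(4))                 # 10000
```

The known prefix therefore shrinks the address search by a factor of 2^24, from 2^48 candidates to about 16.8 million.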
4.9. Near Field Communication (NFC)
Although the communication range of NFC is restricted to a few centimeters, the International Organization for Standardization (ISO) standard does not guarantee secure communication. The common attacks against NFC technologies are briefly mentioned below [
87].
Eavesdropping. By using bigger and more powerful antennas than those of mobile devices, an attacker in the vicinity of the devices can receive or intercept NFC communications. This allows an attacker to eavesdrop on NFC communication across larger distances.
Data Corruption. An attacker can modify data transmitted over an NFC interface. If the attacker alters the data into an unrecognized format, this may result in DoS attacks.
Data Modification. An attacker alters the actual data using amplitude modulations of data transmissions.
Data Insertion. Malicious and undesirable data can be inserted in the form of messages into the data during the data exchange between two devices.
NFC Data Exchange Format (NDEF) attacks. An attacker would exploit partial signatures, record composition attacks and establish trust [
88].
4.10. IEEE 802.15.4
IEEE 802.15.4 is a technical standard, used by several IoT protocols, which describes the operation of low-rate wireless personal area networks (LR-WPANs). It specifies the PHY and MAC layers for LR-WPANs. The IoT protocols based on IEEE 802.15.4 include 6LoWPAN, ZigBee, WirelessHART, ISA 100.11a, MiWi, Thread and SubNetwork Access Protocol (SNAP). These protocols extend the standard by developing the upper layers, which are not covered in IEEE 802.15.4. The common attack types related to the IEEE 802.15.4 standard are explained in [
89,
90,
91].
Radio Interference Attack. An attacker transmits high-power radio interference signals over all channels of the related frequency band.
Symbol Flipping/ Signal Overshadowing Attack. An attacker injects wrong data into a network by converting a legitimate data frame into an altered frame comprising information of the attacker’s choice.
Steganography Attack. Adversaries would use a hidden channel to exchange information about the launching of new attacks in the network.
Node-Specific Flooding. In this attack, packets are emitted to degrade throughput in IoT networks by flooding them with massive amounts of fake data.
Back-Off Manipulation. An attacker deliberately uses shorter back-off periods than the channel access procedure prescribes, gaining channel access ahead of legitimate nodes, which are forced to defer their transmissions.
Battery Life Extension (BLE) Pretense. An attacker transmits unnecessary packets to the victim and due to excessive packet reception, the targeted nodes’ power sources are ultimately exhausted.
Random Number Generator (RNG) Tampering. An attacker tampers with the RNG in a way that guarantees that the back-off periods chosen by the adversary are much smaller than those selected by legitimate nodes.
Back-Off Countdown Omission. This type of attack implicates the complete exclusion of the random back-off countdown by a malicious attacker.
Clear Channel Assessment (CCA) Manipulation/ Reduction/Omission. An attacker gains channel access more frequently and quickly than it is done by legitimate network nodes.
Same-Nonce Attack. When the same key and nonce are reused for two frames, an attacker can combine the two ciphertexts to gather valuable information about the transmitted data.
Replay-Protection Attack. In this type of attack, frames with large sequence numbers are sent by attackers to targeted legitimate nodes. This results in dropping data frames with smaller sequence numbers from other legitimate nodes.
Acknowledgment (ACK) Attack. An attacker sends back a false ACK on behalf of the receiver with the correct expected sequence number to the sender. This prohibits data retransmission by misleading the sender into believing that the frame has been delivered to the receiver successfully [
89].
Guaranteed Time Slot (GTS) Attack. GTS attacks are initiated against the network by exploiting the GTS management scheme.
Personal Area Networks Identifier (PANId) Conflict Attack. An attacker can abuse the conflict resolution procedure by sending fake PANId conflict notifications to the targeted PAN coordinator to start conflict resolution, thus temporarily preventing or delaying communications between the PAN coordinator and member nodes.
Ping-Pong Effect Attack. This attack causes packet loss and service interruption, dropping node performance, and increasing consumption of energy and network load.
Bootstrapping Attack. An attacker forces a targeted network node to become disassociated from its PAN at a time of the attacker’s choosing by initiating any of the MAC or PHY layer attacks, with the ultimate aim of causing DoS.
Steganography Attack. An attacker hides information within the MAC and PHY frame fields of the IEEE 802.15.4 protocol [
89]. Data can be hidden in IEEE 802.15.4 networks by using the PHY header field of PHY frames. Similarly, steganography attacks can also be launched by hiding information within the MAC fields. Steganography attacks form a hidden channel between cooperating attackers in the network, which opens up a large number of opportunities for adversaries.
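To illustrate why nonce reuse matters in the same-nonce attack listed above, the following sketch encrypts two frames with the same key and nonce using a toy stream cipher. The keystream construction, key, nonce and payloads are all hypothetical; this is not the cipher suite defined by IEEE 802.15.4, only a stand-in that shares the relevant property that ciphertext equals plaintext XOR keystream.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream for illustration only (assumption: not the real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))


key = b"hypothetical-key"
nonce = b"reused-nonce"          # the mistake: same nonce for both frames
p1 = b"TEMP=21.5;DOOR=0"
p2 = b"TEMP=21.7;DOOR=1"
c1 = encrypt(key, nonce, p1)
c2 = encrypt(key, nonce, p2)

# The keystream cancels out: the eavesdropper learns p1 XOR p2 without the key.
leak = xor_bytes(c1, c2)
# If one plaintext is known or guessable, the other is recovered directly.
recovered_p2 = xor_bytes(leak, p1)
print(recovered_p2)
```

With distinct nonces the two keystreams would differ and the XOR of the ciphertexts would reveal nothing useful in this model.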
4.11. Routing Protocol for Low Power and Lossy Network (RPL) Attack
The RPL protocol has been designed to allow point-to-point, multipoint-to-point, and point-to-multipoint communication. It is a distance-vector routing protocol based on IPv6. RPL devices work on a specific topology that combines tree and mesh topologies, called a Destination Oriented Directed Acyclic Graph (DODAG) [
74,
92]. Attacks against the routing protocol can cause communication failures within IoT systems [
93]. Connecting IoT systems to the Internet further increases their vulnerability by exposing them to many more attack vectors. The main attacks against RPL are discussed as follows:
Sinkhole Attack. An attacker may announce a favorable route or falsified path to entice many nodes to redirect their packets through it.
Sybil Attack. An attacker may use different identities in the same network to defeat the redundancy techniques of distributed data storage. This can also be used to attack routing algorithms.
Wormhole Attack. An attacker disturbs both traffic and network topology. This attack can be launched by generating a private channel between two attackers in the network and transmitting the selected packets through it.
Blackhole Attack. An attacker maliciously advertises itself as the shortest path to the destination during the path-discovery mechanism and then silently drops the data packets.
Selective Forward Attack. It is a variant of the Blackhole attack, where an attacker drops only a specific subset of the network traffic and forwards all RPL control packets. This attack is mainly targeted at disturbing routing paths; however, it can also be used to filter any protocol [
74].
Hello Flooding Attack. An attacker can announce itself as a neighbor to many nodes, or even to the complete network, by broadcasting a “HELLO” message with a high-powered antenna and a favorable routing metric. The attacker does this to deceive other objects into sending their packets through it [
94].
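The sinkhole and hello flooding attacks above both rely on advertising an attractive routing metric. The sketch below models this with a deliberately simplified rule in which a node picks the neighbor advertising the lowest rank as its parent. The node names, rank values and the `choose_parent` helper are hypothetical, and real RPL parent selection depends on the configured objective function, which is not modeled here.

```python
def choose_parent(advertised_ranks: dict) -> str:
    """Simplified parent selection: the lowest advertised rank wins.

    Assumption: advertisements are trusted as received, with no verification.
    """
    return min(advertised_ranks, key=advertised_ranks.get)


# Hypothetical ranks advertised by honest neighbors.
honest_neighbors = {"node_A": 512, "node_B": 768}
parent_before = choose_parent(honest_neighbors)

# The attacker advertises a falsified, more favorable rank.
with_attacker = dict(honest_neighbors, attacker=257)
parent_after = choose_parent(with_attacker)

print(parent_before, "->", parent_after)
```

Because the victim cannot distinguish a falsified advertisement from a genuine one in this model, its traffic is redirected through the attacker, which can then drop it (Blackhole) or drop part of it (Selective Forward).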
4.12. IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN) Based Attacks
6LoWPAN was designed to meet the communication requirements of connecting resource-constrained, low-powered objects and IPv6 networks. To achieve this, 6LoWPAN uses fragmentation at the adaptation layer. The main attacks against 6LoWPAN are explained as follows:
Fragmentation Attack. An IoT object communicating over IEEE 802.15.4 has a Maximum Transmission Unit (MTU) of 127 bytes, whereas IPv6 requires a minimum MTU of 1280 bytes. 6LoWPAN bridges this gap with a fragmentation mechanism at the adaptation layer. Since fragmentation is performed without any type of authentication, an attacker can inject fragments into a fragmentation chain [
95].
Authentication Attack. In the absence of an authentication mechanism in 6LoWPAN, any malicious object can join the network and get legitimate access [
92].
Confidentiality Attack. In the absence of an encryption technique in 6LoWPAN, attacks affecting confidentiality, like eavesdropping, spoofing and Man-in-the-Middle, can be launched.
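The fragmentation attack above can be sketched with a naive reassembly routine that accepts any fragment carrying a matching datagram tag. The tuple layout (tag, offset, payload), the payloads and the overwrite-on-duplicate behavior are assumptions made for illustration; actual stacks may treat duplicate or overlapping fragments differently.

```python
def reassemble(fragments):
    """Naive reassembly keyed by datagram tag.

    Assumption: no sender authentication, and a later fragment with the same
    (tag, offset) silently replaces the earlier one.
    """
    buffers = {}
    for tag, offset, payload in fragments:
        buffers.setdefault(tag, {})[offset] = payload
    return {
        tag: b"".join(parts[offset] for offset in sorted(parts))
        for tag, parts in buffers.items()
    }


# Hypothetical legitimate fragment chain for datagram tag 7.
legit = [(7, 0, b"open="), (7, 5, b"false")]
# Fragment forged by an attacker who observed or guessed the tag.
forged = (7, 5, b"true!")

clean = reassemble(legit)
tampered = reassemble(legit + [forged])
print(clean[7], tampered[7])
```

Nothing in the reassembly step ties a fragment to its sender, which is the property the attack exploits.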
7. Deep Learning (DL) Techniques for IDSs
DL algorithms outperform ML algorithms in applications involving large datasets. DL becomes most relevant in IoT security applications as IoT environments are characterized by the production of vast amounts and a variety of data [
171]. Furthermore, DL is capable of the automatic modeling of complex feature sets from the sample data [
171]. Another advantage of DL algorithms is their ability to allow deep linking in IoT networks [
172]. This enables automatic interactions between IoT-based systems in the absence of human intervention [
171] to perform assigned collaborative functions.
Because of their ability to extract hierarchical feature representations in complex deep architecture, DL can be classified as a branch of ML algorithms that uses multiple non-linear layers of processing to extract feature sets. These feature sets are then used for abstraction and pattern detection after necessary transformations [
173]. As shown in
Figure 16, DL can be used in a generative mode with unsupervised learning, discriminative mode using supervised learning, or a hybrid approach by combining both modes.
In this section, various major DL based techniques used for designing an IDS are discussed.
Table 5 below summarizes research studies that propose IDSs using various DL-based methods. Details about each research work, along with the DL technique used, are explained in the respective sub-sections below.
7.1. Recurrent Neural Networks (RNNs)
RNN is a discriminative DL algorithm, which is best suited to environments where data is to be processed sequentially. Unlike feed-forward neural networks, its output depends on feedback from earlier computations rather than on forward propagation alone [
173,
174,
175]. A temporal layer is incorporated in an RNN for analyzing data sequentially, followed by learning about multi-dimensional differences in the hidden units of recurrent components [
165]. These hidden units are then modified according to the data encountered by the neural network, so that they are continuously updated and reflect the current state of the network.
An RNN processes its current hidden state by estimating each succeeding hidden state as an activation of the previous hidden state. A simple explanation of RNN functioning is described in
Figure 17. Here, outputs from neurons are sent back as feedback to the neurons of the previous layer. Because IoT environments are characterized by the generation of large amounts of sequential data like network traffic flows, RNNs become relevant in IoT security applications, especially network intrusion detection. Previous research [
176] has proposed the use of an RNN for network intrusion detection through analysis of network traffic behavior and reported obtaining useful results, particularly for time series-based threats. Another recent study [
177] proposes an IDS that uses cascaded filtering stages in which deep multi-layered RNNs are applied for each filter. The RNNs are then trained to detect common attacks launched in IoT environments, like R2L, DoS, U2R and Probe.
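The recurrence described above, in which each hidden state is computed from the previous hidden state and the current input, can be sketched in plain Python. The weights, the two input features and the function names are hypothetical and untrained; the sketch uses the common tanh formulation and is not the model from any cited study.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """h_t[j] = tanh(sum_i w_x[j][i]*x_t[i] + sum_k w_h[j][k]*h_prev[k] + b[j])."""
    h_t = []
    for j in range(len(b)):
        total = b[j]
        total += sum(w * x for w, x in zip(w_x[j], x_t))
        total += sum(w * h for w, h in zip(w_h[j], h_prev))
        h_t.append(math.tanh(total))
    return h_t


def run_rnn(sequence, w_x, w_h, b):
    """Feed a sequence through the cell, returning every hidden state."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Hypothetical weights: 2 input features per time step, 2 hidden units.
w_x = [[0.5, -0.3], [0.1, 0.8]]
w_h = [[0.2, 0.0], [-0.4, 0.6]]
b = [0.0, 0.1]
# Hypothetical traffic flow: (normalized packet size, inter-arrival time).
flow = [[1.0, 0.2], [0.9, 0.1], [0.0, 1.0]]

states = run_rnn(flow, w_x, w_h, b)
print(states[-1])
```

Because `h_prev` feeds into every step, the final state depends on the whole sequence and not only on the last input, which is what makes the model suitable for sequential traffic data.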
Long short-term memory (LSTM) network architectures, which are a specialized form of RNN, have also been used in the design of IDSs. The main attribute of LSTM based RNNs is their ability to persist information, or cell state, for later use in the network. This feature makes them appropriate for analyzing temporal data that changes over time. Thus, LSTM networks are preferred for problems related to anomaly detection in time-series sequence data. Various forms of RNN, including LSTM based RNNs, have been used for anomaly and intrusion detection in IoT networks by researchers in [
178,
179,
180,
181,
182,
183]. While RNNs have demonstrated promising results in predicting time series data, the detection of anomalous traffic using these predictions is still challenging.
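The persistence of cell state mentioned above can be seen in a scalar sketch of the standard LSTM update, where a forget gate scales the old cell state and an input gate admits new information. All parameter values below are hypothetical and hand-picked (not trained) so that the forget gate stays nearly open and the input gate nearly closed.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate -> (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, bias = params[name]
        return activation(w_x * x_t + w_h * h_prev + bias)

    forget = gate("forget", sigmoid)
    write = gate("input", sigmoid)
    output = gate("output", sigmoid)
    candidate = gate("candidate", math.tanh)
    c_t = forget * c_prev + write * candidate  # cell state carried forward
    h_t = output * math.tanh(c_t)              # exposed hidden state
    return h_t, c_t


# Hypothetical parameters: forget gate ~1, input gate ~0.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 0.8  # stored value to be remembered
for x_t in [0.3, -0.7, 0.5] * 10:  # 30 time steps of unrelated input
    h, c = lstm_step(x_t, h, c, params)
print(c)
```

With these gate settings the stored value survives many steps almost unchanged; a trained network learns when to open or close the gates instead.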
7.2. Convolutional Neural Network (CNN)
CNN is also a discriminative DL algorithm, which was designed to minimize the number of data inputs required for a conventional artificial neural network (ANN) through the use of equivariant representation, sparse interaction and sharing of parameters [
184]. Thus CNN becomes more scalable and requires less time for training. There are three layer types in a CNN, namely the convolutional layer, the pooling layer and the activation unit, as shown in
Figure 18. The convolutional layers use various kernels to convolve data inputs [
185]. The pooling layers downsample their inputs, thus minimizing the sizes of succeeding layers. Pooling involves two techniques, max pooling and average pooling, where the former divides the input into distinct clusters and chooses the maximum value of every cluster in the previous layer [
186,
187].
Average pooling, on the other hand, calculates the average value of every cluster in the previous layer. The activation unit applies a non-linear activation function to every feature in the feature set [
187]. CNN is best suited for highly efficient and fast feature extraction from raw data but at the same time CNN requires high computational power [
188]. Hence using CNN on resource-constrained IoT devices for their security is highly challenging. This challenge is partly addressed through a distributed architecture in which a lighter version of a deep NN is trained and implemented on-board with only a subset of vital output classes, whereas the high computational power of the cloud is used to perform the complete training of the algorithm [
166]. The use of CNNs in IoT environment security was discussed in previous research published in [
189,
190] for malware detection. In [
40], the authors propose a hybrid data processing model for network anomaly detection that utilizes Grey Wolf Optimization (GWO) and CNN techniques. The authors claim to have achieved better accuracy and detection rate in comparison to other state-of-the-art IDSs.
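The three CNN layer types described above (convolution, activation and pooling) can be illustrated on a one-dimensional input. The signal, the kernel and the choice of ReLU as the activation function are assumptions for illustration, and the convolution is written as the sliding dot product that DL libraries commonly implement.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal without padding (stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Non-linear activation applied to every feature."""
    return [max(0.0, v) for v in values]


def pool(values, size, mode="max"):
    """Non-overlapping pooling; an incomplete trailing window is dropped."""
    windows = [values[i:i + size] for i in range(0, len(values) - size + 1, size)]
    if mode == "max":
        return [max(w) for w in windows]
    return [sum(w) / len(w) for w in windows]


signal = [1, 3, 2, 8, 4, 0, 5, 7, 6]  # hypothetical per-packet feature
kernel = [-1, 0, 1]                   # hypothetical kernel (detects increases)

features = relu(conv1d(signal, kernel))
max_pooled = pool(features, 2, "max")
avg_pooled = pool(features, 2, "avg")
print(max_pooled, avg_pooled)
```

Pooling halves the length of the feature map here, which is how the succeeding layers are kept small.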
7.3. Deep Autoencoders (AEs)
An AE is an unsupervised algorithm designed to reproduce its input at its output through a decoder function and a hidden layer that holds a code representing the input [
184]. The other function in an AE neural network is called the encoder function and is responsible for the conversion of the acquired input into code. During training, reconstruction errors must be minimized [
191]. One use case for AEs is feature extraction from datasets. However, AEs require high computational power. Deep AEs have been used for the detection of network-based malware in previous research with better accuracy than SVM and KNN [
167]. Kitsune [
41] is one such study, in which an ensemble of deep autoencoders was used to implement an online lightweight IDS for IoT environments based on unsupervised learning and anomaly detection. The authors demonstrate better accuracy as compared to other ML and DL techniques.
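The way reconstruction error supports anomaly detection can be sketched with a tiny linear autoencoder whose encoder and decoder share one fixed direction. The direction `U`, the threshold and the sample points are hypothetical stand-ins for learned parameters; this is an illustration of the principle and not the architecture of any cited system.

```python
import math

# Hypothetical "learned" direction along which normal traffic features lie.
U = (1 / math.sqrt(2), 1 / math.sqrt(2))


def encode(x):
    """Encoder: compress a 2-D input into a 1-D code."""
    return x[0] * U[0] + x[1] * U[1]


def decode(code):
    """Decoder: map the code back to the 2-D input space."""
    return (code * U[0], code * U[1])


def reconstruction_rmse(x):
    """Root-mean-square error between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))


THRESHOLD = 0.5  # hypothetical; in practice chosen from validation data

normal_sample = (2.0, 2.1)     # close to the learned pattern
anomalous_sample = (2.0, -1.5) # far from the learned pattern

for sample in (normal_sample, anomalous_sample):
    score = reconstruction_rmse(sample)
    print(sample, round(score, 3), score > THRESHOLD)
```

Inputs resembling the training data are reconstructed well and score low, while inputs unlike anything seen in training reconstruct poorly and exceed the threshold.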
7.4. Restricted Boltzmann Machine (RBM)
An RBM is an unsupervised learning-based algorithm that builds a deep generative and undirected model [
168]. No two nodes within the same layer of an RBM are connected to each other. An RBM is made up of two types of layers: visible and hidden. Known input parameters are contained in the visible layer, while the unknown latent variables are held in the hidden layer, which may itself comprise several layers. Working hierarchically, features extracted from a dataset are then passed on to the next layer as latent variables. RBMs were used in various research works [
192,
193] for network/IoT intrusion detection systems. The challenge with RBMs is that they need high computational resources, which makes them difficult to implement on low-powered IoT devices. Furthermore, a single RBM lacks the capability of feature representation. However, this limitation can be overcome by stacking two or more RBMs to form a Deep Belief Network (DBN).
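Because nodes within a layer are not connected, each hidden unit of an RBM can be evaluated independently given the visible layer, and vice versa. The sketch below computes these conditional activation probabilities for a binary RBM using the standard sigmoid form; the weights, biases and input vector are hypothetical and untrained.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probabilities(visible, weights, hidden_bias):
    """p(h_j = 1 | v) = sigmoid(hidden_bias[j] + sum_i visible[i] * weights[i][j])."""
    return [
        sigmoid(hidden_bias[j] + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


def visible_probabilities(hidden, weights, visible_bias):
    """p(v_i = 1 | h) = sigmoid(visible_bias[i] + sum_j hidden[j] * weights[i][j])."""
    return [
        sigmoid(visible_bias[i] + sum(h * weights[i][j] for j, h in enumerate(hidden)))
        for i in range(len(visible_bias))
    ]


# Hypothetical RBM with 3 visible and 2 hidden units; weights[i][j] links v_i to h_j.
weights = [[2.0, -1.0], [0.0, 1.0], [-2.0, 0.5]]
hidden_bias = [0.0, 0.0]
visible_bias = [0.0, 0.0, 0.0]

v = [1, 0, 1]  # hypothetical binary input features
p_h = hidden_probabilities(v, weights, hidden_bias)
p_v = visible_probabilities([1, 0], weights, visible_bias)
print(p_h, p_v)
```

Stacking, as in a DBN, amounts to treating the hidden activations of one RBM as the visible input of the next.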
7.5. Deep Belief Network (DBN)
Being formed by stacking two or more RBMs, a DBN can be considered an unsupervised learning-based generative algorithm [
194]. DBNs perform robustly through unsupervised training of each layer separately [
165]. Initial features are extracted in the pre-training phase for each layer, followed by a fine-tuning phase in which a softmax layer is applied on the top layer [
170]. A DBN is mainly composed of two layers, i.e., a visible layer and a hidden layer, as shown in
Figure 19. Though the studies in [
188,
195] discussed malicious attack detection using DBNs with comparatively better results than ML algorithms, no evidence of applicability in the IoT environment was reported in the literature.
7.6. Generative Adversarial Network (GAN)
A GAN is a hybrid DL method that uses both generative and discriminative models at the same time for training [
196]. The distribution of the dataset and its samples is learned by the generative model, while predictions about whether a given sample genuinely originates from the training dataset are made by the discriminative model [
196]. As shown in
Figure 20, both generative and discriminative models work as adversaries where the generative model attempts deception through the generation of a sample using random noise. On the other hand, the discriminative model attempts to authenticate real training data samples from deceptive samples generated by the generative model. Here,
D(x) represents a binary classification giving output as real or fake (generated). The measure of correct/incorrect classification determines the accuracy and performance of both the models in an inversely proportional fashion. This results in models updating in each iteration [
191]. The study published in [
169] discussed the utility of the GAN algorithm for detecting anomalous behavior in IoT environments, with promising results. These results are attributed to the ability of GANs to counter zero-day attacks: the generator produces samples mimicking zero-day attacks, thereby causing the discriminator to learn different attack scenarios. However, the challenge with using GAN is that its training is difficult and it produces unstable results [
196,
197].
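The adversarial relationship described above, where one model's gain is the other's loss, can be made concrete with the commonly used cross-entropy losses. In this sketch the discriminator outputs D(x) are given directly as hypothetical probabilities instead of coming from trained networks, and the generator loss is the widely used non-saturating variant, which is an assumption and not necessarily what the cited studies use.

```python
import math


def discriminator_loss(d_real, d_fake):
    """-(mean log D(x) + mean log(1 - D(G(z)))): low when D separates real from fake."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """-mean log D(G(z)): low when generated samples are classified as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical D(x) outputs (probability that a sample is real).
strong_d = {"real": [0.90, 0.95], "fake": [0.10, 0.05]}  # D detects the fakes
fooled_d = {"real": [0.50, 0.50], "fake": [0.50, 0.50]}  # D cannot tell them apart

for name, out in (("strong", strong_d), ("fooled", fooled_d)):
    print(name,
          round(discriminator_loss(out["real"], out["fake"]), 3),
          round(generator_loss(out["fake"]), 3))
```

When the discriminator classifies well its loss is small and the generator's is large; as the generator improves, the situation reverses, which is the source of both the method's power and its training instability.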
7.7. Ensemble of DL Networks (EDLNs)
As discussed earlier, an ensemble of various ML classifiers proves more effective than an individual ML classifier. Similarly, multiple DL algorithms can be used in parallel by organizing them in an ensemble to produce better results than each component DL algorithm. EDLNs can have any combination of discriminative, generative, or hybrid DL algorithms. Best suited for solving complex issues, EDLNs perform better in uncertain environments with a high number of features. A heterogeneous EDLN has classifiers from different genres, whereas a homogeneous EDLN has classifiers from the same genre. Both compositions are aimed at increasing efficiency and producing accurate results [
198]. Application of EDLN for IoT security requires further study and research, to evaluate the possibility of improving the performance and accuracy of the IoT security system [
12].
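One simple way to combine the outputs of several DL models in an ensemble is majority voting. The sketch below assumes three already-trained component models whose per-sample predictions are given as hypothetical label lists; other combination schemes, such as weighted voting or stacking, are equally possible.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]


# Hypothetical predictions from three component models for four traffic samples.
rnn_pred = ["attack", "normal", "normal", "attack"]
cnn_pred = ["attack", "attack", "normal", "normal"]
ae_pred = ["normal", "attack", "normal", "attack"]

ensemble_pred = majority_vote([rnn_pred, cnn_pred, ae_pred])
print(ensemble_pred)
```

Each component model is wrong on a different sample in this toy example, so the vote can correct individual errors, at the cost of running every model for every sample.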
Table 6 illustrates common attack types handled by corresponding DL methods along with reference to related research.
Table 6 also describes advantages and limitations of each suggested DL method. Later,
Table 5 below covers the comparison of work conducted on ML and DL techniques on IoT Security.
9. Challenges and Future Research Directions
A large number of studies and research works have been published on IDSs for IoT. However, there are still many open research challenges and issues, particularly in the use of ML and DL techniques for anomaly and intrusion detection in IoT. The main challenge is that no standard mechanism exists to validate the proposed systems or methods. Most research works evaluate their proposed systems on synthesized datasets and address one specific problem, so they may not work in the real world on real data and in the presence of other problems. As evident from this and other similar studies on the state of the art in IDS for IoT, it is very difficult to design an IDS that covers at least the most important aspects of an effective IDS, that is, one that is deployable, online and scalable, works effectively on real data, and satisfies all stakeholders’ requirements. Instead, most of the published work shares evaluation results obtained on contrived datasets, covers only a single part or some parts of the system, and shows results using biased parameters.
Furthermore, a proof of completeness and accuracy of any proposed IDS is very hard to define or accomplish. Thus, one of the conclusions from this study is that it is very hard to design a comprehensive IDS that offers good accuracy, scalability, robustness and protection against all types of threats. Below, some of the major issues and challenges that researchers face today and will face in the future are described. Since IoT security measures are still not mature, there is enormous scope for future research in this area, particularly in anomaly and intrusion detection using ML and DL techniques.
The most recent challenges related to anomaly and intrusion detection in IoT networks are discussed in the following:
To test and validate a proposed NIDS, a good-quality dataset related to IoT IDS is essential. Such a dataset should contain a reasonable amount of labeled network flow data covering both attack and normal behavior. Furthermore, in order to capture normal behavior, normal traffic data from each type of IoT device is required in addition to the attack data for testing the NIDS. However, as discussed in the previous section, most of the publicly available datasets fall short of these requirements: they have missing labels, incomplete network features, missing raw pcap files or incomplete CSV files, or are difficult to comprehend. Moreover, the available datasets capture the normal behavior of only specific types of IoT devices, which restricts the training of an IDS to those devices. Creating a dataset that can address these issues in a real environment will be a challenge and a potential area of research.
Developing an online, real-time, anomaly-based IDS for IoT networks is very challenging. This is because such an IDS would first need to learn normal behavior in order to detect abnormal or malicious behavior. The learning phase assumes that there is no noise or attack traffic during this period, which cannot be guaranteed. Such an IDS may generate false alarms if these issues are not addressed.
As also described in this paper, most anomaly-based NIDSs try to construct a model that captures the profile of all possible behaviors or patterns of normal traffic. This, however, is extremely challenging because it has been shown that such models tend to be biased towards the dominant class, that is, the normal class, resulting in high false-positive rates. Furthermore, it is also not possible to capture all possible normal observations that may be generated in a network, particularly in a heterogeneous environment of IoT networks, which increases false-negative rates. Completely avoiding or minimizing false-positive rates and false-negative rates in NIDS is another research challenge.
It would be interesting to develop models trained on specific types of devices. These models can be applied to IDSs in other organizations using a similar type of device. This will assist other organizations, which can deploy these models and thus save time that would have been required to collect the data and train the IDSs. It will also help in detecting malicious IoT devices, which are already compromised because their behavior would be different from normal behavior captured by trained models. Developing such models is a challenging task and a potential area for future research.
The different stages involved in the design and implementation of an NIDS, like data pre-processing, feature reduction, model training and deployment, increase computational complexity, particularly for ML and DL based NIDSs. Thus, designing an efficient NIDS that is light on computational requirements is another challenge and area for future research.
The feature selection and dimensionality reduction methods used in proposed IDSs are suited to a specific type of normal traffic and to detecting a particular type of attack. They may not work once the normal or attack sequences change even slightly, especially in the fast-changing environment of IoT devices and networks. Thus, a dynamic and computationally efficient mechanism for feature selection that can work under all types of normal and attack traffic is a potential research challenge.
DL and ML-based techniques and algorithms are being widely used for training a model on a large dataset. This has facilitated the effective handling of cyber-attacks. However, with regard to the use of DL and ML algorithms for attack detection in IoT networks, some challenges need the attention of researchers; for example, the resource constraints of IoT devices limit the use of DL/ML algorithms [
163] for protection of IoT networks. Another challenge with the use of ML/DL techniques in large and distributed networks, like that of IoT networks, is that they face scalability issues, for example in terms of various scenarios and choices of IDS deployment. One possible solution to limitations of individual DL or ML algorithms suggested by some of the authors [
211] is the use of an ensemble of ML/DL algorithms that performed better in comparison to an individual ML algorithm; however, such algorithms were computationally expensive and thus resulted in network latency issues, which cannot be afforded in critical systems involving risks to human lives, like health and autonomous or internet of vehicles (IoVs) systems.
The techniques of semi-supervised learning, transfer learning and reinforcement learning (RL) have not yet been well explored or tested for designing an IDS for IoT security in order to achieve important objectives like real-time operation, fast training and unified models for anomaly detection in IoT, and thus are potential areas of future research. Moreover, it would be an interesting research area to use RL in combination with DL because their combined use can be beneficial in IoT network scenarios involving large data dimensionality and non-stationary environments.
10. Conclusions
During the last decade, the use of IoT devices has increased exponentially in all walks of life due to their capacity to convert objects from different application areas into Internet hosts. At the same time, users’ privacy and security are threatened by IoT security vulnerabilities. Therefore, there is a requirement to develop more robust security solutions for IoT. Machine and deep learning-based IDSs are one of the key techniques for IoT security. In this work, a survey of ML and DL based intrusion detection techniques used in IDSs for IoT networks and systems is presented. The IoT architecture, protocols, IoT system vulnerabilities, and IoT protocol-level attacks have been discussed in detail. Then, this paper surveyed various research works available in the literature that suggested IDS methodologies for IoT or proposed attack detection techniques for IoT that could be part of an IDS, focusing on the various ML and DL techniques available for IDS in IoT and their use by researchers. A review of various datasets available for IoT security-related research is also provided. This work attempts to provide researchers with a summarized but comprehensive and useful insight into the various security challenges currently faced by IoT systems and networks and their possible solutions, with a focus on intrusion detection based on ML and DL methods.