Non-IID data re-balancing at IoT edge with peer-to-peer federated learning for anomaly detection

H Wang, L Muñoz-González, D Eklund… - Proceedings of the 14th …, 2021 - dl.acm.org
The growing computational power of edge devices has enabled the adoption of distributed machine learning technologies such as federated learning, which builds collaborative models by training locally on the edge devices. Because the data remains on the devices, this improves both the efficiency and the privacy of model training. However, in some IoT networks the connectivity between devices and system components can be limited, which prevents the use of federated learning, as it requires a central node to orchestrate the training of the model. To sidestep this, peer-to-peer learning appears as a promising solution, as it does not require such an orchestrator. On the other hand, the security challenges in IoT deployments have fostered the use of machine learning for attack and anomaly detection. In these problems, under supervised learning approaches, the training datasets are typically imbalanced, i.e. the number of anomalies is very small compared to the number of benign data points, which requires the use of re-balancing techniques to improve the algorithms' performance. In this paper, we propose a novel peer-to-peer algorithm, P2PK-SMOTE, to train supervised anomaly detection machine learning models in non-IID scenarios, including mechanisms to locally re-balance the training datasets via synthetic generation of data points from the minority class. To improve the performance in non-IID scenarios, we also include a mechanism for sharing a small fraction of synthetic data from the minority class across devices, aiming to reduce the risk of data de-identification. Our experimental evaluation on real datasets for IoT anomaly detection across a range of scenarios validates the benefits of our proposed approach.
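The local re-balancing step mentioned in the abstract can be illustrated with a generic SMOTE-style interpolation. The sketch below is not the paper's P2PK-SMOTE algorithm, whose details the abstract does not give; it only shows the general idea of generating synthetic minority-class points between a sample and one of its nearest minority neighbours. The function name smote_like_oversample, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(n)
        base = minority[i]
        # Indices of the k nearest other minority points (Euclidean distance).
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(nearest)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy anomaly samples held by one device (made-up values).
anomalies = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote_like_oversample(anomalies, n_new=10)
```

In a peer-to-peer setting such as the one the abstract describes, each device would presumably run a step like this on its own minority samples before local training; how the synthetic points are then shared across devices is specific to the paper and is not sketched here.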