BOFRF: A novel boosting-based federated random forest algorithm on horizontally partitioned data

M Gencturk, AA Sinaci, NK Cicekli - IEEE Access, 2022 - ieeexplore.ieee.org
Federated learning is commonly applied to ensemble methods with the goal of increasing the predictive power of local models. Existing federated solutions that use ensemble methods can achieve this when the datasets of the sites are balanced and of good quality, i.e., when the local models are already above a certain accuracy threshold. However, they usually fail to provide the same level of improvement to the models of sites that have an unsuccessful classifier because of poor-quality or imbalanced data. To address this challenge, we propose a novel federated ensemble classification algorithm for horizontally partitioned data, namely Boosting-based Federated Random Forest (BOFRF), which not only increases the predictive power of all participating sites, but also provides a significantly high improvement in the predictive power of sites that have unsuccessful local models. We implement a federated version of random forest, which is a well-known bagging algorithm, by adapting the idea of boosting to it. We introduce a novel aggregation and weight calculation methodology that assigns weights to local classifiers based on their classification performance at each site without increasing the communication or computation cost. We evaluate the performance of our proposed algorithm in different federated environments that we set up by using four healthcare datasets. The empirical results show that BOFRF improves the predictive power of local random forest models in all cases. The advantage of BOFRF is that, unlike existing solutions, it provides a significantly high level of improvement for sites that have unsuccessful local models.
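The abstract does not give the weight formula, so the following is only a minimal sketch of the general idea of weighting local classifiers by their performance at each site, not the BOFRF algorithm itself. It assumes an AdaBoost-style weight computed from each model's mean error across sites; that formula, the function names (`error_rate`, `model_weight`, `federated_weights`, `weighted_vote`) and the toy models and data are all hypothetical. Plain Python callables stand in for the local random forests.

```python
import math
from collections import defaultdict

def error_rate(model, X, y):
    """Fraction of examples in (X, y) that `model` misclassifies."""
    wrong = sum(1 for xi, yi in zip(X, y) if model(xi) != yi)
    return wrong / len(y)

def model_weight(errors, eps=1e-6):
    """AdaBoost-style weight from the mean per-site error (assumed formula)."""
    err = min(max(sum(errors) / len(errors), eps), 1 - eps)
    return 0.5 * math.log((1 - err) / err)

def federated_weights(models, site_data):
    """Score every shared model on every site's local data.

    Only error rates are combined; raw records never leave a site.
    """
    return [
        model_weight([error_rate(m, X, y) for X, y in site_data])
        for m in models
    ]

def weighted_vote(models, weights, x):
    """Return the label with the largest total weight among the models' votes."""
    totals = defaultdict(float)
    for m, w in zip(models, weights):
        totals[m(x)] += w
    return max(totals, key=totals.get)

# Hypothetical stand-ins for local models; the true label is 1 when x > 0.
def good(x):
    return int(x > 0)   # accurate local model

def noisy(x):
    return int(x > 2)   # weaker local model

def bad(x):
    return 0            # model from a site with heavily imbalanced data

# Two sites, each holding its own (inputs, labels) split.
site_data = [
    ([-2, -1, 1, 3], [0, 0, 1, 1]),
    ([-3, 1, 2, 4], [0, 1, 1, 1]),
]
models = [good, noisy, bad]
weights = federated_weights(models, site_data)
prediction = weighted_vote(models, weights, 1)
```

In this toy setup the accurate model receives the largest weight and the constant model the smallest, so the weighted vote follows the accurate model even though it is outnumbered. A real system would replace the callables with trained forests and would need the paper's own aggregation rule.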