An insight into imbalanced big data classification: outcomes and challenges

A Fernández, S del Río, NV Chawla… - Complex & Intelligent …, 2017 - Springer
Complex & Intelligent Systems, 2017, Springer
Abstract
Big Data applications have been emerging in recent years, and researchers from many disciplines are aware of the considerable advantages of extracting knowledge from this type of problem. However, traditional learning approaches cannot be applied directly because of scalability issues. To overcome this, the MapReduce framework has arisen as a “de facto” solution. In essence, it carries out a “divide-and-conquer” distributed procedure in a fault-tolerant way that suits commodity hardware. Because this is still a recent discipline, little research has been conducted on imbalanced classification for Big Data. The main reason is the difficulty of adapting standard techniques to the MapReduce programming style. Additionally, intrinsic problems of imbalanced data, namely lack of data and small disjuncts, are accentuated when the data are partitioned to fit the MapReduce programming style. This paper is built on three main pillars. First, it presents the first outcomes for imbalanced classification in Big Data problems, introducing the current state of research in this area. Second, it analyzes the behavior of standard pre-processing techniques in this particular framework. Finally, drawing on the experimental results obtained throughout this work, it discusses the challenges and future directions for the topic.
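The partitioning effect mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the dataset size, the class ratio, the round-robin split, and all function names (`make_imbalanced_labels`, `partition`, `map_count`, `reduce_counts`) are assumptions chosen for illustration, and plain Python lists stand in for a real MapReduce runtime. Each "map" call counts the classes in one partition and the "reduce" step merges the partial counts, so the number of minority examples visible to any single mapper shrinks as the number of partitions grows.

```python
import random
from collections import Counter


def make_imbalanced_labels(n_total, n_minority, seed=0):
    """Build a shuffled label list with n_minority positives (1), rest negatives (0)."""
    labels = [1] * n_minority + [0] * (n_total - n_minority)
    random.Random(seed).shuffle(labels)
    return labels


def partition(items, n_partitions):
    """Round-robin split, a stand-in for how input is divided among mappers."""
    return [items[i::n_partitions] for i in range(n_partitions)]


def map_count(chunk):
    """'Map' step: class counts computed independently on one partition."""
    return Counter(chunk)


def reduce_counts(partials):
    """'Reduce' step: merge the per-partition counts into a global count."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical dataset: 10,000 examples, 1% of them in the minority class.
labels = make_imbalanced_labels(10_000, 100)

for n_partitions in (1, 10, 100):
    partials = [map_count(chunk) for chunk in partition(labels, n_partitions)]
    minority_per_partition = [p[1] for p in partials]
    print(
        f"partitions={n_partitions:3d}  "
        f"minority per partition: min={min(minority_per_partition)}, "
        f"max={max(minority_per_partition)}  "
        f"global minority={reduce_counts(partials)[1]}"
    )
```

The global minority count is unchanged by the split, but with 100 partitions each mapper sees on average a single minority example, which is the "lack of data" situation the abstract refers to.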