A survey on addressing high-class imbalance in big data

JL Leevy, TM Khoshgoftaar, RA Bauder, N Seliya - Journal of Big Data, 2018 - Springer
In a majority–minority classification problem, class imbalance in the dataset(s) can
dramatically skew the performance of classifiers, introducing a prediction bias for the …

Big data preprocessing: methods and prospects

S García, S Ramírez-Gallego, J Luengo, JM Benítez… - Big data analytics, 2016 - Springer
The massive growth in the scale of data has been observed in recent years being a key
factor of the Big Data scenario. Big Data can be defined as high volume, velocity and variety …

Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce

S Ramírez-Gallego, A Fernández, S García, M Chen… - Information …, 2018 - Elsevier
We live in a world where data are generated from a myriad of sources, and it is really cheap to
collect and store such data. However, the real benefit is not related to the data itself, but …

An insight into imbalanced big data classification: outcomes and challenges

A Fernández, S del Río, NV Chawla… - Complex & Intelligent …, 2017 - Springer
Big Data applications are emerging during the last years, and researchers from many
disciplines are aware of the high advantages related to the knowledge extraction from this …

Adaptive machine learning algorithm and analytics of big genomic data for gene prediction

OA Sarumi, CK Leung - Tracking and preventing diseases with artificial …, 2022 - Springer
Artificial intelligence helps in tracking and preventing diseases. For instance, machine
learning algorithms can analyze big genomic data and predict genes, which helps …

A literature survey on various aspect of class imbalance problem in data mining

S Goswami, AK Singh - Multimedia Tools and Applications, 2024 - Springer
Data has become much more widely available in recent years. Learning
classifiers from unbalanced data is a crucial issue that comes up frequently in classification …

Big Data: Preprocesamiento y calidad de datos

F Herrera - novática, 2016 - 150.214.190.154
In recent years, the massive growth in the scale of data has been a key factor
in the current data processing scenario. The effectiveness of the algorithms of …

MEFASD-BD: multi-objective evolutionary fuzzy algorithm for subgroup discovery in big data environments - a MapReduce solution

F Pulgar-Rubio, AJ Rivera-Rivas… - Knowledge-Based …, 2017 - Elsevier
Nowadays, there is an incredible increase of data volumes around the world, with the
Internet as one of the main actors in this scenario and a growth rate above 30GB/s. The …

Emerging trend of big data analytics in bioinformatics: a literature review

K Nagaraj, GS Sharvani… - International Journal of …, 2018 - inderscienceonline.com
Advancement of unparalleled data in bioinformatics over the years is a major concern for
storage and management. Such massive data must be handled efficiently to disseminate …

Graph theory-based sequence descriptors as remote homology predictors

G Agüero-Chapin, D Galpert, R Molina-Ruiz… - Biomolecules, 2019 - mdpi.com
Alignment-free (AF) methodologies have increased in popularity in the last decades as
alternative tools to alignment-based (AB) algorithms for performing comparative sequence …