Transforming big data into smart data: An insight on the use of the k‐nearest neighbors algorithm to obtain quality data

I Triguero, D García‐Gil, J Maillo… - … : Data Mining and …, 2019 - Wiley Online Library
The k‐nearest neighbors algorithm is characterized as a simple yet effective data mining
technique. The main drawback of this technique appears when massive amounts of data …

Machine Learning in Tourism: A Brief Overview: Generation of Knowledge from Experience

R Egger - Applied data science in tourism: Interdisciplinary …, 2022 - Springer
In the last few years, a large hype around the topic of Machine Learning (ML) has emerged;
this can be justified thanks to various ML approaches having recently undergone rapid …

A novel oversampling technique for class-imbalanced learning based on SMOTE and natural neighbors

J Li, Q Zhu, Q Wu, Z Fan - Information Sciences, 2021 - Elsevier
Developing techniques for the machine learning of a classifier from class-imbalanced data
presents an important challenge. Among the existing methods for addressing this problem …

Classification in the presence of label noise: a survey

B Frénay, M Verleysen - IEEE transactions on neural networks …, 2013 - ieeexplore.ieee.org
Label noise is an important issue in classification, with many potential negative
consequences. For example, the accuracy of predictions may decrease, whereas the …

Tutorial on practical tips of the most influential data preprocessing algorithms in data mining

S García, J Luengo, F Herrera - Knowledge-Based Systems, 2016 - Elsevier
Data preprocessing is a major and essential stage whose main goal is to obtain final data
sets that can be considered correct and useful for further data mining algorithms. This paper …

Prototype selection for nearest neighbor classification: Taxonomy and empirical study

S Garcia, J Derrac, J Cano… - IEEE transactions on …, 2012 - ieeexplore.ieee.org
The nearest neighbor classifier is one of the most used and well-known techniques for
performing recognition tasks. It has also demonstrated itself to be one of the most useful …

Explaining the success of nearest neighbor methods in prediction

GH Chen, D Shah - Foundations and Trends® in Machine …, 2018 - nowpublishers.com
Many modern methods for prediction leverage nearest neighbor search to find past training
examples most similar to a test example, an idea that dates back in text to at least the 11th …

MRPR: A MapReduce solution for prototype reduction in big data classification

I Triguero, D Peralta, J Bacardit, S García, F Herrera - neurocomputing, 2015 - Elsevier
In the era of big data, analyzing and extracting knowledge from large-scale data sets is a
very interesting and challenging task. The application of standard data mining tools in such …

Computer‐assisted keyword and document set discovery from unstructured text

G King, P Lam, ME Roberts - American Journal of Political …, 2017 - Wiley Online Library
The (unheralded) first step in many applications of automated text analysis involves
selecting keywords to choose documents from a large text corpus for further study. Although …

SMOTE-NaN-DE: Addressing the noisy and borderline examples problem in imbalanced classification by natural neighbors and differential evolution

J Li, Q Zhu, Q Wu, Z Zhang, Y Gong, Z He… - Knowledge-Based …, 2021 - Elsevier
Learning a classifier from class-imbalance data is an important challenge. Among existing
solutions, SMOTE is one of the most successful methods and has an extensive range of …