Reviewing autoencoders for missing data imputation: Technical trends, applications and outcomes

RC Pereira, MS Santos, PP Rodrigues… - Journal of Artificial …, 2020 - jair.org
Missing data is a problem often found in real-world datasets and it can degrade the
performance of most machine learning models. Several deep learning techniques have …

Systematic review of using machine learning in imputing missing values

M Alabadla, F Sidi, I Ishak, H Ibrahim… - IEEE …, 2022 - ieeexplore.ieee.org
Missing data are a universal data quality problem in many domains, leading to misleading
analysis and inaccurate decisions. Much research has been done to investigate the different …

Generating synthetic missing data: A review by missing mechanism

MS Santos, RC Pereira, AF Costa, JP Soares… - IEEE …, 2019 - ieeexplore.ieee.org
The performance evaluation of imputation algorithms often involves the generation of
missing values. Missing values can be inserted in only one feature (univariate configuration) …

ydata-profiling: Accelerating data-centric AI with high-quality data

F Clemente, GM Ribeiro, A Quemy, MS Santos… - Neurocomputing, 2023 - Elsevier
ydata-profiling is an open-source Python package for advanced exploratory data
analysis that enables users to generate data profiling reports in a simple, fast, and efficient …

Predicting cervical cancer using machine learning methods

R Alsmariy, G Healy… - International Journal of …, 2020 - pdfs.semanticscholar.org
In almost all countries, precautionary measures are less expensive than medical treatment.
The early detection of any disease gives a patient better chances of successful treatment …

The impact of heterogeneous distance functions on missing data imputation and classification performance

MS Santos, PH Abreu, A Fernández, J Luengo… - … Applications of Artificial …, 2022 - Elsevier
This work performs an in-depth study of the impact of distance functions on K-Nearest
Neighbours imputation of heterogeneous datasets. Missing data is generated at several …

Missing data imputation via denoising autoencoders: the untold story

AF Costa, MS Santos, JP Soares, PH Abreu - Advances in Intelligent Data …, 2018 - Springer
Missing data consists in the lack of information in a dataset and since it directly influences
classification performance, neglecting it is not a valid option. Over the years, several studies …

How distance metrics influence missing data imputation with k-nearest neighbours

MS Santos, PH Abreu, S Wilk, J Santos - Pattern Recognition Letters, 2020 - Elsevier
In missing data contexts, k-nearest neighbours imputation has proven beneficial since it
takes advantage of the similarity between patterns to replace missing values. When dealing …

A data-driven missing value imputation approach for longitudinal datasets

C Ribeiro, AA Freitas - Artificial Intelligence Review, 2021 - Springer
Longitudinal datasets of human ageing studies usually have a high volume of missing data,
and one way to handle missing values in a dataset is to replace them with estimations …

Relation-aware shared representation learning for cancer prognosis analysis with auxiliary clinical variables and incomplete multi-modality data

Z Ning, D Du, C Tu, Q Feng… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The integrative analysis of complementary phenotype information contained in multi-modality
data (e.g., histopathological images and genomic data) has advanced the …