A survey on self-supervised learning for non-sequential tabular data

WY Wang, WW Du, D Xu, W Wang, WC Peng - Machine Learning, 2025 - Springer
Self-supervised learning (SSL) has been incorporated into many state-of-the-art models in
various domains, where SSL defines pretext tasks based on unlabeled datasets to learn …

Morphological prototyping for unsupervised slide representation learning in computational pathology

AH Song, RJ Chen, T Ding… - Proceedings of the …, 2024 - openaccess.thecvf.com
Representation learning of pathology whole-slide images (WSIs) has
primarily relied on weak supervision with Multiple Instance Learning (MIL). However, the …

Distribution alignment optimization through neural collapse for long-tailed classification

J Gao, H Zhao, D dan Guo, H Zha - Forty-first International …, 2024 - openreview.net
A well-trained deep neural network on balanced datasets usually exhibits the Neural
Collapse (NC) phenomenon, which is an informative indicator of the model achieving good …

A closer look at deep learning on tabular data

HJ Ye, SY Liu, HR Cai, QL Zhou, DC Zhan - arXiv preprint arXiv …, 2024 - arxiv.org
Tabular data is prevalent across various domains in machine learning. Although Deep
Neural Network (DNN)-based methods have shown promising performance comparable to …

Modern neighborhood components analysis: A deep tabular baseline two decades later

HJ Ye, HH Yin, DC Zhan - arXiv preprint arXiv:2407.03257, 2024 - arxiv.org
The growing success of deep learning in various domains has prompted investigations into
its application to tabular data, where deep models have shown promising results compared …

Boosted multilayer feedforward neural network with multiple output layers

H Aly, AK Al-Ali, PN Suganthan - Pattern Recognition, 2024 - Elsevier
This research introduces the Boosted Ensemble deep Multi-Layer Perceptron
(EdMLP) architecture with multiple output layers, a novel enhancement for the traditional …

TALENT: A Tabular Analytics and Learning Toolbox

SY Liu, HR Cai, QL Zhou, HJ Ye - arXiv preprint arXiv:2407.04057, 2024 - arxiv.org
Tabular data is one of the most common data sources in machine learning. Although a wide
range of classical methods demonstrate practical utilities in this field, deep learning methods …

T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data

H Thimonier, JLDM Costa, F Popineau… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-supervision is often used for pre-training to foster performance on a downstream task by
constructing meaningful representations of samples. Self-supervised learning (SSL) …

Manifoldron: Direct space partition via manifold discovery

D Wang, FL Fan, BJ Hou, H Zhang, Z Jia… - … on Neural Networks …, 2024 - ieeexplore.ieee.org
A neural network (NN) with the widely-used ReLU activation has been shown to partition the
sample space into many convex polytopes for prediction. However, the parametric way a NN …

Scalable Representation Learning for Multimodal Tabular Transactions

N Raman, S Ganesh, M Veloso - arXiv preprint arXiv:2410.07851, 2024 - arxiv.org
Large language models (LLMs) are primarily designed to understand unstructured text.
When directly applied to structured formats such as tabular data, they may struggle to …