Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

TransTab: Learning transferable tabular transformers across tables

Z Wang, J Sun - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
Tabular data (or tables) are the most widely used data format in machine learning (ML).
However, ML models often assume the table structure keeps fixed in training and testing …

TURL: Table understanding through representation learning

X Deng, H Sun, A Lees, Y Wu, C Yu - ACM SIGMOD Record, 2022 - dl.acm.org
Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such
tables, there has been tremendous progress on a variety of tasks in the area of table …

Transformers for tabular data representation: A tutorial on models and applications

G Badaro, P Papotti - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
In the last few years, the natural language processing community witnessed advances in
neural representations of free texts with transformer-based language models (LMs). Given …

Transformers for tabular data representation: A survey of models and applications

G Badaro, M Saeed, P Papotti - Transactions of the Association for …, 2023 - direct.mit.edu
In the last few years, the natural language processing community has witnessed advances
in neural representations of free texts with transformer-based language models (LMs). Given …

HiTab: A hierarchical table dataset for question answering and natural language generation

Z Cheng, H Dong, Z Wang, R Jia, J Guo, Y Gao… - arXiv preprint arXiv …, 2021 - arxiv.org
Tables are often created with hierarchies, but existing works on table reasoning mainly focus
on flat tables and neglect hierarchical tables. Hierarchical tables challenge existing methods …

HYTREL: Hypergraph-enhanced tabular data representation learning

P Chen, S Sarkar, L Lausen… - Advances in …, 2024 - proceedings.neurips.cc
Language models pretrained on large collections of tabular data have
demonstrated their effectiveness in several downstream tasks. However, many of these …

Robust (controlled) table-to-text generation with structure-aware equivariance learning

F Wang, Z Xu, P Szekely, M Chen - arXiv preprint arXiv:2205.03972, 2022 - arxiv.org
Controlled table-to-text generation seeks to generate natural language descriptions for
highlighted subparts of a table. Previous SOTA systems still employ a sequence-to …

StruBERT: Structure-aware BERT for table search and matching

M Trabelsi, Z Chen, S Zhang, BD Davison… - Proceedings of the ACM …, 2022 - dl.acm.org
A table is composed of data values that are organized in rows and columns providing
implicit structural information. A table is usually accompanied by secondary information such …

DeepJoin: Joinable table discovery with pre-trained language models

Y Dong, C Xiao, T Nozawa, M Enomoto… - arXiv preprint arXiv …, 2022 - arxiv.org
Due to the usefulness in data enrichment for data analysis tasks, joinable table discovery
has become an important operation in data lake management. Existing approaches target …