From tabular data to knowledge graphs: A survey of semantic table interpretation tasks and methods

J Liu, Y Chabot, R Troncy, VP Huynh, T Labbé… - Journal of Web …, 2023 - Elsevier
Tabular data often refers to data that is organized in a table with rows and columns. We
observe that this data format is widely used on the Web and within enterprise data …

Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

TAPEX: Table pre-training via learning a neural SQL executor

Q Liu, B Chen, J Guo, M Ziyadi, Z Lin, W Chen… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent progress in language model pre-training has achieved a great success via
leveraging large-scale unstructured textual data. However, it is still a challenge to apply pre …

A hierarchical spatial transformer for massive point samples in continuous space

W He, Z Jiang, T Xiao, Z Xu, S Chen… - Advances in neural …, 2023 - proceedings.neurips.cc
Transformers are widely used deep learning architectures. Existing transformers are mostly
designed for sequences (texts or time series), images or videos, and graphs. This paper …

MultiHiertt: Numerical reasoning over multi hierarchical tabular and textual data

Y Zhao, Y Li, C Li, R Zhang - arXiv preprint arXiv:2206.01347, 2022 - arxiv.org
Numerical reasoning over hybrid data containing both textual and tabular content (e.g.,
financial reports) has recently attracted much attention in the NLP community. However …

Annotating columns with pre-trained language models

Y Suhara, J Li, Y Li, D Zhang, Ç Demiralp… - Proceedings of the …, 2022 - dl.acm.org
Inferring meta information about tables, such as column headers or relationships between
columns, is an active research topic in data management as we find many tables are …

Transformers for tabular data representation: A survey of models and applications

G Badaro, M Saeed, P Papotti - Transactions of the Association for …, 2023 - direct.mit.edu
In the last few years, the natural language processing community has witnessed advances
in neural representations of free texts with transformer-based language models (LMs). Given …

OmniTab: Pretraining with natural and synthetic data for few-shot table-based question answering

Z Jiang, Y Mao, P He, G Neubig, W Chen - arXiv preprint arXiv …, 2022 - arxiv.org
The information in tables can be an important complement to text, making table-based
question answering (QA) systems of great value. The intrinsic complexity of handling tables …

Chain-of-Table: Evolving tables in the reasoning chain for table understanding

Z Wang, H Zhang, CL Li, JM Eisenschlos… - arXiv preprint arXiv …, 2024 - arxiv.org
Table-based reasoning with large language models (LLMs) is a promising direction to tackle
many table understanding tasks, such as table-based question answering and fact …

TableLlama: Towards open large generalist models for tables

T Zhang, X Yue, Y Li, H Sun - arXiv preprint arXiv:2311.09206, 2023 - arxiv.org
Semi-structured tables are ubiquitous. There has been a variety of tasks that aim to
automatically interpret, augment, and query tables. Current methods often require …