Web table extraction, retrieval, and augmentation: A survey

S Zhang, K Balog - ACM Transactions on Intelligent Systems and …, 2020 - dl.acm.org
Tables are powerful and popular tools for organizing and manipulating data. A vast number
of tables can be found on the Web, which represent a valuable knowledge resource. The …

Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

TUTA: Tree-based transformers for generally structured table pre-training

Z Wang, H Dong, R Jia, J Li, Z Fu, S Han… - Proceedings of the 27th …, 2021 - dl.acm.org
We propose TUTA, a unified pre-training architecture for understanding generally structured
tables. Noticing that understanding a table requires spatial, hierarchical, and semantic …

Large language models for tabular data: Progresses and future directions

H Dong, Z Wang - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Tables contain a significant portion of the world's structured information. The ability to
efficiently and accurately understand, process, reason about, analyze, and generate tabular …

Turning tables: Generating examples from semi-structured tables for endowing language models with reasoning skills

O Yoran, A Talmor, J Berant - arXiv preprint arXiv:2107.07261, 2021 - arxiv.org
Models pre-trained with a language modeling objective possess ample world knowledge
and language skills, but are known to struggle in tasks that require reasoning. In this work …

Explain and predict, and then predict again

Z Zhang, K Rudra, A Anand - Proceedings of the 14th ACM international …, 2021 - dl.acm.org
A desirable property of learning systems is to be both effective and interpretable. Towards
this goal, recent models have been proposed that first generate an extractive explanation …

Web table retrieval using multimodal deep learning

R Shraga, H Roitman, G Feigenblat… - Proceedings of the 43rd …, 2020 - dl.acm.org
We address the web table retrieval task, aiming to retrieve and rank web tables as whole
answers to a given information need. To this end, we formally define web tables as …

Entrant: A large financial dataset for table understanding

E Zavitsanos, D Mavroeidis, E Spyropoulou… - Scientific Data, 2024 - nature.com
Tabular data is a way to structure, organize, and present information conveniently and
effectively. Real-world tables present data in two dimensions by arranging cells in matrices …

TCN: Table convolutional network for web table interpretation

D Wang, P Shiralkar, C Lockard, B Huang… - Proceedings of the Web …, 2021 - dl.acm.org
Information extraction from semi-structured webpages provides valuable long-tailed facts for
augmenting knowledge graph. Relational Web tables are a critical component containing …

OCT-GAN: Neural ODE-based conditional tabular GANs

J Kim, J Jeon, J Lee, J Hyeong, N Park - Proceedings of the Web …, 2021 - dl.acm.org
Synthesizing tabular data is attracting much attention these days for various purposes. With
sophisticated synthetic data, for instance, one can augment its training data. For the past …