Table understanding: Problem overview

A Shigarov - Wiley Interdisciplinary Reviews: Data Mining and …, 2023 - Wiley Online Library
Tables are probably the most natural way to represent relational data in various media and
formats. They store a large number of valuable facts that could be utilized for question …

Annotating columns with pre-trained language models

Y Suhara, J Li, Y Li, D Zhang, Ç Demiralp… - Proceedings of the …, 2022 - dl.acm.org
Inferring meta information about tables, such as column headers or relationships between
columns, is an active research topic in data management as we find many tables are …

Sato: Contextual semantic type detection in tables

D Zhang, Y Suhara, J Li, M Hulsebos… - arXiv preprint arXiv …, 2019 - arxiv.org
Detecting the semantic types of data columns in relational tables is important for various
data preparation and information retrieval tasks such as data cleaning, schema matching …

SANTOS: Relationship-based semantic table union search

A Khatiwada, G Fan, R Shraga, Z Chen… - Proceedings of the …, 2023 - dl.acm.org
Existing techniques for unionable table search define unionability using metadata (tables
must have the same or similar schemas) or column-based metrics (for example, the values …

Table understanding approaches for extracting knowledge from heterogeneous tables

S Bonfitto, E Casiraghi, M Mesiti - … reviews: Data mining and …, 2021 - Wiley Online Library
Table understanding methods extract, transform, and interpret the information contained in
tabular data embedded in documents/files of different formats. Such automatic …

A fully automated approach to a complete semantic table interpretation

M Cremaschi, F De Paoli, A Rula, B Spahiu - Future Generation Computer …, 2020 - Elsevier
In recent years, there has been an increasing interest in extracting and annotating tables on
the Web. This activity allows the transformation of text data into machine-readable formats to …

Sudowoodo: Contrastive self-supervised learning for multi-purpose data integration and preparation

R Wang, Y Li, J Wang - 2023 IEEE 39th International …, 2023 - ieeexplore.ieee.org
Machine learning (ML) is playing an increasingly important role in data management tasks,
particularly in Data Integration and Preparation (DI&P). The success of ML-based …

TCN: Table convolutional network for web table interpretation

D Wang, P Shiralkar, C Lockard, B Huang… - Proceedings of the Web …, 2021 - dl.acm.org
Information extraction from semi-structured webpages provides valuable long-tailed facts for
augmenting knowledge graph. Relational Web tables are a critical component containing …

DomainNet: Homograph detection and understanding in data lake disambiguation

A Leventidis, L Di Rocco, W Gatterbauer… - ACM Transactions on …, 2023 - dl.acm.org
Modern data lakes are heterogeneous in the vocabulary that is used to describe data. We
study a problem of disambiguation in data lakes: How can we determine if a data value …

DomainNet: Homograph detection for data lake disambiguation

A Leventidis, L Di Rocco, W Gatterbauer… - arXiv preprint arXiv …, 2021 - arxiv.org
Modern data lakes are deeply heterogeneous in the vocabulary that is used to describe
data. We study a problem of disambiguation in data lakes: how can we determine if a data …