Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation

Z Kasner, O Dušek - Proceedings of the 62nd Annual Meeting of …, 2024 - aclanthology.org
We analyze the behaviors of open large language models (LLMs) on the task of data-to-text
(D2T) generation, i.e., generating coherent and relevant text from structured data. To avoid …

Data-to-text generation for severely under-resourced languages with GPT-3.5: A bit of help needed from Google Translate

M Lorandi, A Belz - arXiv preprint arXiv:2308.09957, 2023 - arxiv.org
LLMs like GPT are great at tasks involving English which dominates in their training data. In
this paper, we look at how they cope with tasks involving languages that are severely under …

Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation

Z Kasner, O Dušek - arXiv preprint arXiv:2401.10186, 2024 - arxiv.org
We investigate to which extent open large language models (LLMs) can generate coherent
and relevant text from structured data. To prevent bias from benchmarks leaked into LLM …

High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models

M Lorandi, A Belz - arXiv preprint arXiv:2402.12267, 2024 - arxiv.org
The performance of NLP methods for severely under-resourced languages cannot currently
hope to match the state of the art in NLP methods for well resourced languages. We explore …

Curriculum Learning for Cross-Lingual Data-to-Text Generation With Noisy Data

KA Hari, M Gupta, V Varma - arXiv preprint arXiv:2412.13484, 2024 - arxiv.org
Curriculum learning has been used to improve the quality of text generation systems by
ordering the training samples according to a particular schedule in various tasks. In the …

Evaluating RDF-to-text Generation Models for English and Russian on Out Of Domain Data

A Nikiforovskaya, C Gardent - Proceedings of the 17th …, 2024 - aclanthology.org
While the WebNLG dataset has prompted much research on generation from knowledge
graphs, little work has examined how well models trained on the WebNLG data generalise …

DCU-ADAPT-modPB at the GEM'24 Data-to-Text Generation Task: Model Hybridisation for Pipeline Data-to-Text Natural Language Generation

CC Osuji, R Huidrom, KJ Adebayo… - Proceedings of the …, 2024 - aclanthology.org
In this paper, we present our approach to the GEM Shared Task at the INLG'24 Generation
Challenges, which focuses on generating data-to-text in multiple languages, including low …

DipInfo-UniTo at the GEM'24 Data-to-Text Task: Augmenting LLMs with the Split-Generate-Aggregate Pipeline

M Oliverio, PF Balestrucci, A Mazzei… - Proceedings of the 17th …, 2024 - iris.unito.it
This paper describes the DipInfo-UniTo system participating to the GEM shared task 2024.
We participate only to the Data-to-Text (D2T) task. The DipInfo-UniTo system is based on …

Exploring the impact of data representation on neural data-to-text generation

DM Howcroft, LN Watson, O Nedopas… - Proceedings of the 17th …, 2024 - aclanthology.org
A relatively under-explored area in research on neural natural language generation is the
impact of the data representation on text quality. Here we report experiments on two leading …

Data-to-Text Generation with Neural Language Models

Z Kasner - 2024 - dspace.cuni.cz
Data-to-text generation systems need to produce texts with high levels of semantic
accuracy. Rule-based systems can guarantee this aspect, but their fluency and adaptability …