Locally typical sampling

C Meister, T Pimentel, G Wiher… - Transactions of the …, 2023 - direct.mit.edu
Today's probabilistic language generators fall short of producing coherent and fluent
text, despite the fact that the underlying models perform well under standard …
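
The decision rule this abstract alludes to can be made concrete in a few lines. Below is a minimal sketch of locally typical sampling, assuming the model's next-token distribution is already available as a probability vector; the function name, the default tau=0.95, and the NumPy implementation are illustrative choices, not the authors' code.

```python
import numpy as np

def locally_typical_sample(probs, tau=0.95, rng=None):
    """Sample one token id via locally typical sampling.

    Keeps the tokens whose surprisal (-log p) lies closest to the
    entropy of the distribution, adding them in order of closeness
    until their cumulative mass reaches tau, then renormalizes and
    samples from that set.
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    logp = np.log(probs + 1e-12)
    entropy = -np.sum(probs * logp)              # H(p)
    deviation = np.abs(-logp - entropy)          # |surprisal - entropy|
    order = np.argsort(deviation)                # most "typical" tokens first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, tau)) + 1  # smallest set with mass >= tau
    kept = order[:cutoff]
    return int(rng.choice(kept, p=probs[kept] / probs[kept].sum()))
```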

Language model evaluation beyond perplexity

C Meister, R Cotterell - arXiv preprint arXiv:2106.00085, 2021 - arxiv.org
We propose an alternate approach to quantifying how well language models learn natural
language: we ask how well they match the statistical tendencies of natural language. To …
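
One concrete instance of "matching statistical tendencies" is checking whether generated text follows the same rank-frequency (Zipf) law as natural language. The sketch below estimates a Zipf exponent from a token stream; it is an illustrative diagnostic, not the paper's actual metric suite.

```python
from collections import Counter
import numpy as np

def zipf_exponent(tokens):
    """Estimate the Zipf exponent of a token stream by least squares
    on log-frequency vs. log-rank. Comparing the exponent of model
    samples against that of human text is one such tendency check."""
    freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
    ranks = np.arange(1, len(freqs) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope  # natural language typically sits near 1
```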

Gradient-based constrained sampling from language models

S Kumar, B Paria, Y Tsvetkov - arXiv preprint arXiv:2205.12558, 2022 - arxiv.org
Large pretrained language models generate fluent text but are notoriously hard to
controllably sample from. In this work, we study constrained sampling from such language …
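
"Gradient-based constrained sampling" suggests Langevin-style updates on a continuous relaxation of the text, with constraints folded into a differentiable energy. The following is a generic sketch of one such update, assuming a caller-supplied `energy_fn` (e.g., LM negative log-likelihood plus constraint penalties); the step size and schedule are placeholders, not the paper's exact procedure.

```python
import torch

def langevin_step(x, energy_fn, step=0.1):
    """One Langevin-dynamics update on a continuous representation x:
    x <- x - (step/2) * grad E(x) + sqrt(step) * noise.
    energy_fn is assumed to return a scalar combining the LM's negative
    log-likelihood with differentiable constraint penalties."""
    x = x.detach().requires_grad_(True)
    grad, = torch.autograd.grad(energy_fn(x), x)
    with torch.no_grad():
        return x - 0.5 * step * grad + torch.randn_like(x) * step ** 0.5
```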

Fully abstractive approach to guided summarization

PE Genest, G Lapalme - Proceedings of the 50th Annual Meeting …, 2012 - aclanthology.org
This paper shows that full abstraction can be accomplished in the context of guided
summarization. We describe a work in progress that relies on Information Extraction …

A reparameterized discrete diffusion model for text generation

L Zheng, J Yuan, L Yu, L Kong - arXiv preprint arXiv:2302.05737, 2023 - arxiv.org
This work studies discrete diffusion probabilistic models with applications to natural
language generation. We derive an alternative yet equivalent formulation of the sampling …
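
As a rough illustration of discrete diffusion sampling for text, the sketch below performs one reverse step of an absorbing-state (mask-based) sampler: a denoiser predicts a token for every position, and a fraction of the masked positions is revealed each step. The reveal schedule and the function shape are simplified assumptions and do not reproduce the paper's reparameterized transitions.

```python
import torch

def reverse_step(denoiser, x_t, mask_id, t):
    """One reverse step of an absorbing-state (mask-based) discrete
    diffusion sampler: the denoiser predicts a token for every position,
    and each still-masked position is revealed with probability 1/t, so
    everything is unmasked by t = 1. The schedule is a simplification."""
    logits = denoiser(x_t)                                    # (seq_len, vocab)
    pred = torch.distributions.Categorical(logits=logits).sample()
    still_masked = x_t == mask_id
    reveal = still_masked & (torch.rand(x_t.shape) < 1.0 / t)
    return torch.where(reveal, pred, x_t)
```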

Large language models demonstrate the potential of statistical learning in language

P Contreras Kallens… - Cognitive …, 2023 - Wiley Online Library
To what degree can language be acquired from linguistic input alone? This question has
vexed scholars for millennia and is still a major focus of debate in the cognitive science of …

Truncation sampling as language model desmoothing

J Hewitt, CD Manning, P Liang - arXiv preprint arXiv:2210.15191, 2022 - arxiv.org
Long samples of text from neural language models can be of poor quality. Truncation
sampling algorithms, like top-$p$ or top-$k$, address this by setting some words' …
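
Both truncation schemes named in the abstract fit one template: sort tokens by probability, keep a prefix of the list, renormalize, sample. A minimal NumPy sketch, with function and parameter names chosen here for illustration:

```python
import numpy as np

def truncate_and_sample(probs, top_k=None, top_p=None, rng=None):
    """Sample a token id after truncating the distribution: keep the
    k most probable tokens (top-k) and/or the shortest prefix whose
    mass reaches p (top-p), zero out the tail, renormalize, sample."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    order = np.argsort(probs)[::-1]            # tokens by descending probability
    keep = len(order)
    if top_k is not None:
        keep = min(keep, top_k)
    if top_p is not None:
        cum = np.cumsum(probs[order])
        keep = min(keep, int(np.searchsorted(cum, top_p)) + 1)
    kept = order[:keep]
    return int(rng.choice(kept, p=probs[kept] / probs[kept].sum()))
```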

Phrase-based statistical language generation using graphical models and active learning

F Mairesse, M Gasic, F Jurcicek, S Keizer… - Proceedings of the …, 2010 - aclanthology.org
Most previous work on trainable language generation has focused on two paradigms: (a)
using a statistical model to rank a set of generated utterances, or (b) using statistics to inform …
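
Paradigm (a), ranking a set of generated utterances with a statistical model, can be sketched with any sentence scorer. Below, a hypothetical add-alpha smoothed bigram log-probability serves as the scorer; the counts are assumed to be `collections.Counter` objects built from a corpus, and the smoothing choice is illustrative.

```python
import math
from collections import Counter

def bigram_logprob(sentence, bigrams, unigrams, alpha=1.0):
    """Add-alpha smoothed bigram log-probability of a whitespace-split
    sentence; bigrams and unigrams are Counters built from a corpus."""
    vocab = len(unigrams)
    lp = 0.0
    for prev, cur in zip(sentence.split(), sentence.split()[1:]):
        lp += math.log((bigrams[(prev, cur)] + alpha) /
                       (unigrams[prev] + alpha * vocab))
    return lp

def rank_utterances(candidates, bigrams, unigrams):
    """Return candidate utterances sorted best-first by the scorer."""
    return sorted(candidates,
                  key=lambda s: bigram_logprob(s, bigrams, unigrams),
                  reverse=True)
```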

On decoding strategies for neural text generators

G Wiher, C Meister, R Cotterell - Transactions of the Association for …, 2022 - direct.mit.edu
When generating text from probabilistic models, the chosen decoding strategy has a
profound effect on the resulting text. Yet the properties elicited by various decoding …
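
A small illustration of why the decoding strategy matters: the same next-token distribution behaves very differently under greedy decoding versus temperature sampling. The helper below is an assumed sketch, not code from the paper.

```python
import numpy as np

def decode_step(probs, strategy="greedy", temperature=1.0, rng=None):
    """Choose the next token under a given strategy: greedy takes the
    argmax; any other strategy draws from the temperature-scaled
    distribution. Same model, different strategy, different text."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    if strategy == "greedy":
        return int(np.argmax(probs))
    logits = np.log(probs + 1e-12) / temperature
    p = np.exp(logits - logits.max())
    return int(rng.choice(len(p), p=p / p.sum()))
```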

A neural probabilistic language model

Y Bengio, R Ducharme… - Advances in neural …, 2000 - proceedings.neurips.cc
A goal of statistical language modeling is to learn the joint probability function of sequences
of words. This is intrinsically difficult because of the curse of dimensionality: we propose to …
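
The architecture this abstract introduces embeds the n-1 context words in a shared continuous space, concatenates the embeddings, and predicts the next word through a hidden layer and softmax. A compact PyTorch sketch with illustrative sizes, not the original implementation:

```python
import torch
import torch.nn as nn

class NPLM(nn.Module):
    """Feedforward neural probabilistic language model: embed the n-1
    context words in a shared low-dimensional space, concatenate, apply
    a tanh hidden layer, and output next-word logits over the vocabulary.
    Sharing embeddings across positions is what counters the curse of
    dimensionality the abstract mentions."""
    def __init__(self, vocab_size, context=4, dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.fc = nn.Linear(context * dim, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, context_ids):              # (batch, context)
        e = self.embed(context_ids).flatten(1)   # (batch, context * dim)
        return self.out(torch.tanh(self.fc(e)))  # (batch, vocab_size)
```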