Locally typical sampling

C Meister, T Pimentel, G Wiher… - Transactions of the …, 2023 - direct.mit.edu
Today's probabilistic language generators fall short when it comes to producing coherent
and fluent text despite the fact that the underlying models perform well under standard …
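
The truncation rule the title refers to can be illustrated with a short sketch: keep the tokens whose surprisal lies closest to the conditional entropy of the next-token distribution, up to a probability mass tau. This is a minimal numpy reconstruction of that idea as described in the paper, not the authors' reference implementation; the function name, the 1e-12 smoothing term, and the default tau = 0.95 are illustrative choices.

    import numpy as np

    def locally_typical_filter(logits: np.ndarray, tau: float = 0.95) -> np.ndarray:
        """Return a boolean vocabulary mask: the smallest set of tokens whose
        surprisal is closest to the distribution's entropy and whose total
        probability mass reaches tau."""
        probs = np.exp(logits - logits.max())        # stable softmax
        probs /= probs.sum()
        surprisal = -np.log(probs + 1e-12)           # -log p(w | context)
        entropy = (probs * surprisal).sum()          # H = E[-log p]
        # Rank tokens by how close their surprisal is to the entropy.
        order = np.argsort(np.abs(surprisal - entropy))
        cumulative = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cumulative, tau)) + 1
        mask = np.zeros_like(probs, dtype=bool)
        mask[order[:cutoff]] = True
        return mask

    # Sample only from the "locally typical" set.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=50)
    mask = locally_typical_filter(logits)
    filtered = np.where(mask, np.exp(logits - logits.max()), 0.0)
    filtered /= filtered.sum()
    next_token = rng.choice(len(logits), p=filtered)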

Testing the predictions of surprisal theory in 11 languages

EG Wilcox, T Pimentel, C Meister, R Cotterell… - Transactions of the …, 2023 - direct.mit.edu
Surprisal theory posits that less-predictable words should take more time to process, with
word predictability quantified as surprisal, i.e., negative log probability in context. While …
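
The snippet's definition of surprisal is directly computable. A minimal sketch, in bits; the function name and example probabilities are illustrative, not from the paper:

    import math

    def surprisals(token_probs):
        """Per-word surprisal in bits: s(w_t) = -log2 p(w_t | w_<t).
        `token_probs` are the in-context probabilities a language model
        assigns to each word of a sentence, in order."""
        return [-math.log2(p) for p in token_probs]

    # A word with p = 0.5 in context carries 1 bit of surprisal; a p = 0.01
    # word carries ~6.64 bits and, per the theory, should take longer to read.
    print(surprisals([0.5, 0.25, 0.01]))  # [1.0, 2.0, 6.64...]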

Revisiting the uniform information density hypothesis

C Meister, T Pimentel, P Haller, L Jäger… - arXiv preprint arXiv …, 2021 - arxiv.org
The uniform information density (UID) hypothesis posits a preference among language
users for utterances structured such that information is distributed uniformly across a signal …
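
One common operationalization of UID, among those this line of work examines, is the variance of per-word surprisals across an utterance, with lower variance read as more uniform information density. A hedged sketch; the function name and toy numbers are mine, and the paper compares several operationalizations, not only this one:

    import statistics

    def uid_variance(surprisal_values):
        """Variance of per-word surprisals across an utterance;
        lower variance = more uniform information density."""
        return statistics.pvariance(surprisal_values)

    # A flat surprisal profile scores as more uniform than a spiky one.
    print(uid_variance([3.0, 3.1, 2.9, 3.0]))  # 0.005
    print(uid_variance([0.5, 8.0, 0.4, 3.1]))  # 9.505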

On the probability-quality paradox in language generation

C Meister, G Wiher, T Pimentel, R Cotterell - arXiv preprint arXiv …, 2022 - arxiv.org
When generating natural language from neural probabilistic models, high probability does
not always coincide with high quality: It has often been observed that mode-seeking …

Revisiting the optimality of word lengths

T Pimentel, C Meister, EG Wilcox, K Mahowald… - arXiv preprint arXiv …, 2023 - arxiv.org
Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative
costs. Under the assumption that cost is given by an utterance's length, he supported this …

A Cross-Linguistic Pressure for Uniform Information Density in Word Order

TH Clark, C Meister, T Pimentel, M Hahn… - Transactions of the …, 2023 - direct.mit.edu
While natural languages differ widely in both canonical word order and word order flexibility,
their word orders still follow shared cross-linguistic statistical patterns, often attributed to …

An information-theoretic analysis of self-supervised discrete representations of speech

BM Abdullah, MM Shaik, B Möbius… - arXiv preprint arXiv …, 2023 - arxiv.org
Self-supervised representation learning for speech often involves a quantization step that
transforms the acoustic input into discrete units. However, it remains unclear how to …

Quantifying the redundancy between prosody and text

L Wolf, T Pimentel, E Fedorenko, R Cotterell… - arXiv preprint arXiv …, 2023 - arxiv.org
Prosody (the suprasegmental component of speech, including pitch, loudness, and tempo)
carries critical aspects of meaning. However, the relationship between the information …

Using linguistic typology to enrich multilingual lexicons: the case of lexical gaps in kinship

T Khishigsuren, G Bella, K Batsuren, AA Freihat… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper describes a method to enrich lexical resources with content relating to linguistic
diversity, based on knowledge from the field of lexical typology. We capture the …

Grammatical cues to subjecthood are redundant in a majority of simple clauses across languages

K Mahowald, E Diachek, E Gibson, E Fedorenko… - Cognition, 2023 - Elsevier
Grammatical cues are sometimes redundant with word meanings in natural language. For
instance, English word order rules constrain the word order of a sentence like “The dog …