Is MAP decoding all you need? The inadequacy of the mode in neural machine translation

B Eikema, W Aziz - arXiv preprint arXiv:2005.10283, 2020 - arxiv.org
Recent studies have revealed a number of pathologies of neural machine translation (NMT)
systems. Hypotheses explaining these mostly suggest there is something fundamentally …

On reinforcement learning and distribution matching for fine-tuning language models with no catastrophic forgetting

T Korbak, H Elsahar, G Kruszewski… - Advances in Neural …, 2022 - proceedings.neurips.cc
The availability of large pre-trained models is changing the landscape of Machine Learning
research and practice, moving from a "training from scratch" to a "fine-tuning" paradigm …

Discriminative reranking for neural machine translation

A Lee, M Auli, MA Ranzato - … of the 59th Annual Meeting of the …, 2021 - aclanthology.org
Reranking models enable the integration of rich features to select a better output hypothesis
within an n-best list or lattice. These models have a long history in NLP, and we revisit …

Residual energy-based models for text

A Bakhtin, Y Deng, S Gross, M Ott, MA Ranzato… - Journal of Machine …, 2021 - jmlr.org
Current large-scale auto-regressive language models (Radford et al., 2019; Liu et al., 2018;
Graves, 2013) display impressive fluency and can generate convincing text. In this work we …

Versatile energy-based probabilistic models for high energy physics

T Cheng, AC Courville - Advances in Neural Information …, 2024 - proceedings.neurips.cc
As a classical generative modeling approach, energy-based models have the natural
advantage of flexibility in the form of the energy function. Recently, energy-based models …

Metaphor generation based on noval evaluation method

C Su, X Wang, Y Chang, K Wu, Y Chen - Neurocomputing, 2024 - Elsevier
Metaphor generation is a difficult research area to study. In this task, the generated content
must maintain certain elements of the original content, such as verbs, adjectives, and …

Improving joint training of inference networks and structured prediction energy networks

L Tu, RY Pang, K Gimpel - arXiv preprint arXiv:1911.02891, 2019 - arxiv.org
Deep energy-based models are powerful, but pose challenges for learning and inference
(Belanger and McCallum, 2016). Tu and Gimpel (2018) developed an efficient framework for …

Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs

B Eikema, G Kruszewski, H Elsahar… - arXiv preprint arXiv …, 2021 - arxiv.org
Energy-Based Models (EBMs) allow for extremely flexible specifications of probability
distributions. However, they do not provide a mechanism for obtaining exact samples from …

Versatile Energy-Based Models for High Energy Physics

T Cheng, A Courville - 2023 - openreview.net
Energy-Based Models (EBMs) have the natural advantage of flexibility in the form of the
energy function. Recently, EBMs have achieved great success in modeling high …

E-Forcing: Improving Autoregressive Models by Treating it as an Energy-Based One

Y Wang, T Che, B Li, K Song, H Pei, Y Bengio, D Li - openreview.net
Autoregressive generative models are commonly used to solve tasks involving sequential
data. They have, however, been plagued by a slew of inherent flaws due to the intrinsic …