Minimum risk training for neural machine translation

S Shen, Y Cheng, Z He, W He, H Wu, M Sun… - arXiv preprint arXiv …, 2015 - arxiv.org
We propose minimum risk training for end-to-end neural machine translation. Unlike
conventional maximum likelihood estimation, minimum risk training is capable of optimizing …

Quality-aware decoding for neural machine translation

P Fernandes, A Farinhas, R Rei, JGC de Souza… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the progress in machine translation quality estimation and evaluation in recent
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …

Beyond BLEU: training neural machine translation with semantic similarity

J Wieting, T Berg-Kirkpatrick, K Gimpel… - arXiv preprint arXiv …, 2019 - arxiv.org
While most neural machine translation (NMT) systems are still trained using maximum
likelihood estimation, recent work has demonstrated that optimizing systems to directly …

Classical structured prediction losses for sequence to sequence learning

S Edunov, M Ott, M Auli, D Grangier… - arXiv preprint arXiv …, 2017 - arxiv.org
There has been much recent work on training neural attention models at the sequence-level
using either reinforcement learning-style methods or by optimizing the beam. In this paper …

From language to programs: Bridging reinforcement learning and maximum marginal likelihood

K Guu, P Pasupat, EZ Liu, P Liang - arXiv preprint arXiv:1704.07926, 2017 - arxiv.org
Our goal is to learn a semantic parser that maps natural language utterances into
executable programs when only indirect supervision is available: examples are labeled with …

High quality rather than high model probability: Minimum Bayes risk decoding with neural metrics

M Freitag, D Grangier, Q Tan, B Liang - Transactions of the …, 2022 - direct.mit.edu
In Neural Machine Translation, it is typically assumed that the sentence with the
highest estimated probability should also be the translation with the highest quality as …

Recent advances on neural headline generation

Ayana, SQ Shen, YK Lin, CC Tu, Y Zhao, ZY Liu… - Journal of computer …, 2017 - Springer
Recently, neural models have been proposed for headline generation by learning to map
documents to headlines with recurrent neural networks. In this work, we give a detailed …

Batch tuning strategies for statistical machine translation

C Cherry, G Foster - Proceedings of the 2012 conference of the …, 2012 - aclanthology.org
There has been a proliferation of recent work on SMT tuning algorithms capable of handling
larger feature sets than the traditional MERT approach. We analyze a number of these …

LENS: A learnable evaluation metric for text simplification

M Maddela, Y Dou, D Heineman, W Xu - arXiv preprint arXiv:2212.09739, 2022 - arxiv.org
Training learnable metrics using modern language models has recently emerged as a
promising method for the automatic evaluation of machine translation. However, existing …

Differentiable dynamic programming for structured prediction and attention

A Mensch, M Blondel - International Conference on Machine …, 2018 - proceedings.mlr.press
Dynamic programming (DP) solves a variety of structured combinatorial problems by
iteratively breaking them down into smaller subproblems. In spite of their versatility, many …