Self-supervised deep language modeling has shown unprecedented success across natural language tasks and has recently been repurposed for biological sequences. However …
Large language models have been shown to struggle with multi-step reasoning, and do not retain previous reasoning steps for future use. We propose a simple method for solving both …
Y Li, S Si, G Li, CJ Hsieh… - Advances in Neural …, 2021 - proceedings.neurips.cc
Attentional mechanisms are order-invariant. Positional encoding is a crucial component that allows attention-based deep architectures such as the Transformer to address sequences …
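The order-invariance claim in this snippet is easy to check directly. Below is a minimal numpy sketch (my illustration, not code from the cited paper): permuting the inputs of plain self-attention merely permutes its output rows, while adding fixed sinusoidal encodings of the kind introduced in "Attention Is All You Need" ties tokens to their slots and breaks the symmetry.

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                     # (n, n) similarity matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ x                                # (n, d) attended output

def sinusoidal_encoding(n, d):
    """Fixed sinusoidal positional encodings (Vaswani et al., 2017)."""
    pos = np.arange(n)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))          # 5 tokens, dimension 8
perm = np.array([4, 0, 1, 2, 3])     # a fixed non-trivial permutation

# Without positions, attention is permutation-equivariant:
# shuffling the tokens only shuffles the output rows the same way.
assert np.allclose(self_attention(x)[perm], self_attention(x[perm]))

# With encodings attached to slots (positions stay put while tokens move),
# the permuted sequence is no longer equivalent to the original.
pe = sinusoidal_encoding(5, 8)
y_orig = self_attention(x + pe)
y_perm = self_attention(x[perm] + pe)
assert not np.allclose(y_orig[perm], y_perm)
```

The identity projections keep the sketch short; learned query/key/value matrices would not change either property, since they act on each token independently.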
The evolution of Neural Machine Translation (NMT) has been significantly influenced by six core challenges (Koehn and Knowles, 2017) that have acted as benchmarks for …
Z Ma, Z Dou, Y Zhu, H Zhong, JR Wen - Proceedings of the 44th …, 2021 - dl.acm.org
Personalized chatbots focus on endowing a chatbot with a consistent personality so that it behaves like a real user, gives more informative responses, and can further act as a personal assistant …
D Variš, O Bojar - arXiv preprint arXiv:2109.07276, 2021 - arxiv.org
Transformer-based sequence-to-sequence architectures, while achieving state-of-the-art results on a large number of NLP tasks, can still suffer from overfitting during training. In …
Short texts (STs) appear in a variety of scenarios, including queries, dialogs, and entity names. Most existing studies in neural machine translation (NMT) focus on tackling …
Position representations are crucial for building position-aware Transformers. Existing position representations suffer from a lack of generalization to test …
Large-scale crop type classification is a task at the core of remote sensing efforts with applications of both economic and ecological importance. Current state-of-the-art deep …