Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Rare words: A major problem for contextualized embeddings and how to fix it by attentive mimicking

T Schick, H Schütze - Proceedings of the AAAI Conference on Artificial …, 2020 - ojs.aaai.org
Pretraining deep neural network architectures with a language modeling objective has
brought large improvements for many natural language processing tasks. Exemplified by …

Probing for idiomaticity in vector space models

M Garcia, TK Vieira, C Scarton… - Proceedings of the …, 2021 - eprints.whiterose.ac.uk
Contextualised word representation models have been successfully used for capturing
different word usages, and they may be an attractive alternative for representing idiomaticity …

AStitchInLanguageModels: Dataset and methods for the exploration of idiomaticity in pre-trained language models

HT Madabushi, E Gow-Smith, C Scarton… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite their success in a variety of NLP tasks, pre-trained language models, due to their
heavy reliance on compositionality, fail to effectively capture the meanings of multiword …

Too much in common: Shifting of embeddings in transformer language models and its implications

D Biś, M Podkorytov, X Liu - … of the 2021 conference of the North …, 2021 - aclanthology.org
The success of language models based on the Transformer architecture appears to be
inconsistent with observed anisotropic properties of representations learned by such …

Imputing out-of-vocabulary embeddings with love makes language models robust with little cost

L Chen, G Varoquaux, FM Suchanek - arXiv preprint arXiv:2203.07860, 2022 - arxiv.org
State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle
when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the …

Assessing idiomaticity representations in vector models with a noun compound dataset labeled at type and token levels

M Garcia, T Kramer Vieira, C Scarton… - Proceedings of ACL …, 2021 - eprints.whiterose.ac.uk
Accurate assessment of the ability of embedding models to capture idiomaticity may require
evaluation at token rather than type level, to account for degrees of idiomaticity and possible …

Graph-based Relation Mining for Context-free Out-of-vocabulary Word Embedding Learning

Z Liang, Y Lu, HG Chen, Y Rao - … of the 61st Annual Meeting of …, 2023 - aclanthology.org
Out-of-vocabulary (OOV) words are difficult to represent yet critical to the performance
of embedding-based downstream models. Prior OOV word embedding learning methods …

Representation learning via variational Bayesian networks

O Barkan, A Caciularu, I Rejwan, O Katz… - Proceedings of the 30th …, 2021 - dl.acm.org
We present Variational Bayesian Network (VBN), a novel Bayesian entity representation
learning model that utilizes hierarchical and relational side information and is particularly …

Lacking the embedding of a word? Look it up into a traditional dictionary

ES Ruzzetti, L Ranaldi, M Mastromattei… - arXiv preprint arXiv …, 2021 - arxiv.org
Word embeddings are powerful dictionaries, which may easily capture language variations.
However, these dictionaries fail to give sense to rare words, which are surprisingly often …