Evolution of semantic similarity—a survey

D Chandrasekaran, V Mago - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Estimating the semantic similarity between text data is one of the challenging and open
research problems in the field of Natural Language Processing (NLP). The versatility of …
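
A minimal sketch (not from the survey itself) of one common baseline for the task: mean-pool pretrained word vectors for each text and compare the pooled vectors with cosine similarity. The tiny embedding table below is hypothetical; in practice the vectors would come from a model such as word2vec or GloVe.

```python
import numpy as np

# Toy pretrained word vectors (in practice, loaded from e.g. word2vec or GloVe).
EMBEDDINGS = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.0, 0.9, 0.4]),
    "pet": np.array([0.7, 0.1, 0.2]),
}

def sentence_vector(tokens):
    """Mean-pool the vectors of the tokens we have embeddings for."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    return np.mean(vecs, axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

s1 = sentence_vector(["the", "cat", "is", "a", "pet"])
s2 = sentence_vector(["my", "dog", "is", "a", "pet"])
s3 = sentence_vector(["the", "car", "broke", "down"])

print(cosine(s1, s2))  # high: the two pet sentences share meaning
print(cosine(s1, s3))  # lower: unrelated topic
```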

Analysis methods in neural language processing: A survey

Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …
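
One analysis method in this family is the probing (diagnostic) classifier: a deliberately simple classifier trained on frozen representations to test whether a linguistic property is decodable from them. The sketch below uses synthetic vectors in place of real model representations, so the data and the probed property are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for frozen representations from a neural model (here: random
# vectors whose first coordinate weakly encodes a binary linguistic
# property, e.g. singular vs. plural subject).
X = rng.normal(size=(1000, 64))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The probe is a linear classifier, so high accuracy suggests the property
# is linearly decodable from the representations rather than learned by
# the probe itself.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probing accuracy:", probe.score(X_test, y_test))
```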

Evaluating word embedding models: Methods and experimental results

B Wang, A Wang, F Chen, Y Wang… - APSIPA Transactions on …, 2019 - cambridge.org
This work conducts an extensive evaluation of a large number of word embedding models
for language processing applications. First, we introduce popular word …
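
A hedged illustration of the standard intrinsic evaluation: correlate the model's cosine similarities with human similarity ratings over word pairs. The embeddings, pairs, and ratings below are toy values standing in for a benchmark such as WordSim-353 or SimLex-999.

```python
import numpy as np
from scipy.stats import spearmanr

# Toy embeddings and hypothetical human-rated word pairs.
EMB = {
    "cat":   np.array([0.9, 0.1, 0.0]),
    "dog":   np.array([0.8, 0.2, 0.1]),
    "tiger": np.array([0.85, 0.05, 0.1]),
    "car":   np.array([0.0, 0.9, 0.4]),
}
pairs = [("cat", "dog"), ("cat", "tiger"), ("dog", "car")]
human_ratings = [8.5, 7.0, 1.5]  # invented gold similarity judgments

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

model_scores = [cosine(EMB[a], EMB[b]) for a, b in pairs]

# Intrinsic evaluation: rank correlation between model and human judgments.
rho, _ = spearmanr(model_scores, human_ratings)
print("Spearman's rho:", rho)
```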

A survey of cross-lingual word embedding models

S Ruder, I Vulić, A Søgaard - Journal of Artificial Intelligence Research, 2019 - jair.org
Cross-lingual representations of words enable us to reason about word meaning in
multilingual contexts and are a key facilitator of cross-lingual transfer when developing …
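
Many of the models surveyed are mapping-based: learn a linear (often orthogonal) map from the source embedding space to the target space from a seed dictionary. A sketch of the orthogonal Procrustes solution on synthetic data, with the "dictionary" simulated by a random rotation plus noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for monolingual embeddings: rows of X are source-language
# vectors, rows of Y are the target-language vectors of their translations.
d = 50
true_rotation = np.linalg.qr(rng.normal(size=(d, d)))[0]
X = rng.normal(size=(200, d))
Y = X @ true_rotation + 0.01 * rng.normal(size=(200, d))

# Orthogonal Procrustes: W = U V^T with U S V^T = SVD(X^T Y), which
# minimizes ||XW - Y|| under the constraint that W is orthogonal.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Any source word can now be mapped into the target space and a translation
# retrieved by nearest-neighbour (often cosine or CSLS) search.
print("relative mapping error:", np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))
```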

Interpreting pretrained contextualized representations via reductions to static embeddings

R Bommasani, K Davis, C Cardie - … of the 58th Annual Meeting of …, 2020 - aclanthology.org
Contextualized representations (e.g., ELMo, BERT) have become the default pretrained
representations for downstream NLP applications. In some settings, this transition has …
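
One such reduction simply pools a word's contextualized vectors over many contexts into a single static vector. A sketch under that assumption, with random arrays standing in for, e.g., BERT hidden states collected from a corpus:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for contextualized vectors of the same word collected from many
# different sentences; in practice these would be hidden states produced by
# a pretrained model for that word's (sub)tokens.
contextual_vectors = rng.normal(loc=0.5, size=(100, 768))

# Mean pooling over contexts yields one static vector for the word, which
# can then be inspected or evaluated like a word2vec/GloVe vector.
static_vector = contextual_vectors.mean(axis=0)
print(static_vector.shape)  # (768,)
```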

On the limitations of unsupervised bilingual dictionary induction

A Søgaard, S Ruder, I Vulić - arXiv preprint arXiv:1805.03620, 2018 - arxiv.org
Unsupervised machine translation, i.e., not assuming any cross-lingual supervision signal
(whether a dictionary, translations, or comparable corpora), seems impossible, but …

All-but-the-top: Simple and effective postprocessing for word representations

J Mu, S Bhat, P Viswanath - arXiv preprint arXiv:1702.01417, 2017 - arxiv.org
Real-valued word representations have transformed NLP applications; popular examples
are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this …
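
The postprocessing the title refers to removes the common mean vector and the projections onto the top few principal components. A sketch of that procedure on synthetic vectors (the choice of three components here is arbitrary; the paper suggests on the order of dimension/100):

```python
import numpy as np

def all_but_the_top(embeddings, n_components):
    """Post-process word vectors: remove the mean and the projections onto
    the top principal components."""
    # 1. Centre the embedding matrix.
    mu = embeddings.mean(axis=0)
    centered = embeddings - mu
    # 2. Top principal directions via SVD of the centred matrix.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    top = Vt[:n_components]                    # (n_components, dim)
    # 3. Subtract each vector's projection onto those directions.
    return centered - centered @ top.T @ top

rng = np.random.default_rng(0)
vectors = rng.normal(size=(5000, 300)) + 2.0   # toy vectors with a common bias
processed = all_but_the_top(vectors, n_components=3)
print(processed.mean(axis=0)[:5])              # approximately zero mean
```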

All bark and no bite: Rogue dimensions in transformer language models obscure representational quality

W Timkey, M Van Schijndel - arXiv preprint arXiv:2109.04404, 2021 - arxiv.org
Similarity measures are a vital tool for understanding how language models represent and
process language. Standard representational similarity measures such as cosine similarity …
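
A sketch of the underlying observation: because cosine similarity sums per-dimension products, a single high-magnitude "rogue" dimension can supply most of the score. The data below are synthetic, and per-dimension standardisation is shown only as one common correction, not necessarily the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for contextual representations: most dimensions are well behaved,
# but one "rogue" dimension has a large offset shared by all vectors.
reps = rng.normal(size=(1000, 64))
reps[:, 0] += 40.0

u, v = reps[0], reps[1]
norm = np.linalg.norm(u) * np.linalg.norm(v)

# cos(u, v) = sum_i u_i * v_i / (||u|| ||v||), so dimension i's share of the
# similarity is u_i * v_i / (||u|| ||v||).
contributions = (u * v) / norm
print("cosine similarity:", contributions.sum())
print("share from dimension 0:", contributions[0] / contributions.sum())

# Standardising each dimension across the corpus before measuring similarity
# prevents any single dimension from dominating.
z = (reps - reps.mean(axis=0)) / reps.std(axis=0)
zu, zv = z[0], z[1]
print("cosine after standardisation:",
      float(zu @ zv / (np.linalg.norm(zu) * np.linalg.norm(zv))))
```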

A survey of word embeddings evaluation methods

A Bakarov - arXiv preprint arXiv:1801.09536, 2018 - arxiv.org
Word embeddings are real-valued word representations able to capture lexical semantics
and trained on natural language corpora. Models proposing these representations have …

Distributional models of word meaning

A Lenci - Annual Review of Linguistics, 2018 - annualreviews.org
Distributional semantics is a usage-based model of meaning, based on the assumption that
the statistical distribution of linguistic items in context plays a key role in characterizing their …
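
A classic instantiation of this assumption is the count-based model: build a word-word co-occurrence matrix and reweight it with positive pointwise mutual information (PPMI). The tiny corpus and sentence-level context window below are invented for illustration.

```python
import numpy as np
from collections import Counter
from itertools import combinations

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the car needs new tires",
]

# Count word-word co-occurrences within each sentence (a crude context window).
cooc = Counter()
word_counts = Counter()
for sent in corpus:
    tokens = sent.split()
    word_counts.update(tokens)
    for w1, w2 in combinations(tokens, 2):
        cooc[(w1, w2)] += 1
        cooc[(w2, w1)] += 1

vocab = sorted(word_counts)
idx = {w: i for i, w in enumerate(vocab)}
M = np.zeros((len(vocab), len(vocab)))
for (w1, w2), c in cooc.items():
    M[idx[w1], idx[w2]] = c

# PPMI weighting: the standard way to turn raw counts into a
# distributional meaning space.
total = M.sum()
row = M.sum(axis=1, keepdims=True)
col = M.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((M * total) / (row * col))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Rows of `ppmi` are distributional representations: "cat" co-occurs with
# "chased", so that cell is positive, while "car" never does.
print(ppmi[idx["cat"], idx["chased"]], ppmi[idx["car"], idx["chased"]])
```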