Too much in common: Shifting of embeddings in transformer language models and its implications

D Biś, M Podkorytov, X Liu - … of the 2021 conference of the North …, 2021 - aclanthology.org
The success of language models based on the Transformer architecture appears to be
inconsistent with observed anisotropic properties of representations learned by such …

A survey on machine learning from few samples

J Lu, P Gong, J Ye, J Zhang, C Zhang - Pattern Recognition, 2023 - Elsevier
The capability of learning and generalizing successfully from very few samples is a
noticeable demarcation separating artificial intelligence from human intelligence. Despite the …

Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models

Y Zhang, D Li, M Okumura - arXiv preprint arXiv:2408.01308, 2024 - arxiv.org
Learning token embeddings based on token co-occurrence statistics has proven effective for
both pre-training and fine-tuning in natural language processing. However, recent studies …

Neural variational learning for grounded language acquisition

N Pillai, C Matuszek, F Ferraro - 2021 30th IEEE International …, 2021 - ieeexplore.ieee.org
We propose a learning system in which language is grounded in visual percepts without
specific pre-defined categories of terms. We present a unified generative method to acquire …

A deep learning transformer model predicts high rates of undiagnosed rare disease in large electronic health systems

DM Jordan, HMT Vy, R Do - medRxiv, 2023 - ncbi.nlm.nih.gov
It is estimated that as many as 1 in 16 people worldwide suffer from rare diseases. Rare
disease patients face difficulty finding diagnosis and treatment for their conditions, including …

Few-shot learning with language models: Learning from instructions and contexts

T Schick - 2022 - edoc.ub.uni-muenchen.de
Pretraining deep neural networks to perform language modeling, that is, to reconstruct
missing words from incomplete pieces of text, has brought large improvements throughout …