Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Contrastive representation learning: A framework and review

PH Le-Khac, G Healy, AF Smeaton - IEEE Access, 2020 - ieeexplore.ieee.org
Contrastive Learning has recently received interest due to its success in self-supervised
representation learning in the computer vision domain. However, the origins of Contrastive …
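The snippet names the contrastive approach without stating its objective. Below is a minimal sketch of InfoNCE, one common contrastive loss of the kind such reviews cover; the function name, batch layout, and temperature value are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(queries, keys, temperature=0.1):
    """Minimal InfoNCE contrastive loss (sketch).

    queries, keys: (N, D) tensors where queries[i] and keys[i] form a
    positive pair; every other key in the batch acts as a negative.
    """
    # Normalize so dot products become cosine similarities.
    queries = F.normalize(queries, dim=1)
    keys = F.normalize(keys, dim=1)
    # (N, N) similarity matrix, scaled by a temperature hyperparameter.
    logits = queries @ keys.t() / temperature
    # The positive for row i sits in column i, so targets are 0..N-1.
    targets = torch.arange(queries.size(0), device=queries.device)
    return F.cross_entropy(logits, targets)
```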

Attention Is All You Need (NIPS), 2017

A Vaswani, N Shazeer, N Parmar, J Uszkoreit… - arXiv preprint arXiv …, 2017 - codetds.com
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture …
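The attention mechanism the abstract refers to is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as defined in the paper. A minimal PyTorch sketch follows; the function name and optional mask argument are our own framing.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    # Similarity of each query to each key, scaled by sqrt(d_k).
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    # Weighted combination of the values.
    return weights @ v
```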

LLaMA: Open and efficient foundation language models

H Touvron, T Lavril, G Izacard, X Martinet… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B
parameters. We train our models on trillions of tokens, and show that it is possible to train …

StarCoder: may the source be with you!

R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …

Flamingo: a visual language model for few-shot learning

JB Alayrac, J Donahue, P Luc… - Advances in neural …, 2022 - proceedings.neurips.cc
Building models that can be rapidly adapted to novel tasks using only a handful of annotated
examples is an open challenge for multimodal machine learning research. We introduce …

LaMDA: Language models for dialog applications

R Thoppilan, D De Freitas, J Hall, N Shazeer… - arXiv preprint arXiv …, 2022 - arxiv.org
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of
Transformer-based neural language models specialized for dialog, which have up to 137B …

Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model

S Smith, M Patwary, B Norick, P LeGresley… - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained general-purpose language models can achieve state-of-the-art accuracies in
various natural language processing domains by adapting to downstream tasks via zero …

Improving language models by retrieving from trillions of tokens

S Borgeaud, A Mensch, J Hoffmann… - International …, 2022 - proceedings.mlr.press
We enhance auto-regressive language models by conditioning on document chunks
retrieved from a large corpus, based on local similarity with preceding tokens. With a 2 …
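The retrieval step described here, conditioning on corpus chunks selected by local similarity with the preceding tokens, can be illustrated with a toy nearest-neighbor search. RETRO itself embeds chunks with a frozen BERT model and uses an approximate index over a multi-trillion-token database; the brute-force cosine search below is a deliberate simplification, and all names are hypothetical.

```python
import numpy as np

def retrieve_neighbors(context_emb, chunk_embs, k=2):
    """Return indices of the k corpus chunks most similar to the context.

    context_emb: (D,) embedding of the preceding token chunk.
    chunk_embs:  (N, D) precomputed embeddings of corpus chunks.
    """
    # Cosine similarity via normalized dot products.
    chunks = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    ctx = context_emb / np.linalg.norm(context_emb)
    sims = chunks @ ctx
    # Highest-similarity chunks first.
    return np.argsort(-sims)[:k]
```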

Scaling language models: Methods, analysis & insights from training Gopher

JW Rae, S Borgeaud, T Cai, K Millican… - arXiv preprint arXiv …, 2021 - arxiv.org
Language modelling provides a step towards intelligent communication systems by
harnessing large repositories of written human knowledge to better predict and understand …