V Berisha, C Krantsevich, PR Hahn, S Hahn… - npj Digital Medicine, 2021 - nature.com
Digital health data are multimodal and high-dimensional. A patient's health state can be characterized by a multitude of signals including medical imaging, clinical variables …
The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision …
We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data. EVA is a vanilla ViT pre-trained …
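The snippet cuts off before stating the pretext task; EVA is pretrained to reconstruct masked-out image-text aligned vision features conditioned on the visible patches. Below is a minimal, hypothetical sketch of a masked-feature-prediction loss in that spirit; all function names, shapes, and the masking ratio are illustrative assumptions, not the paper's code.

# Hypothetical sketch of masked feature prediction in the spirit of EVA's
# pretraining objective; shapes and names are assumptions for illustration.
import torch
import torch.nn.functional as F

def masked_feature_loss(student_feats, teacher_feats, mask):
    """student_feats, teacher_feats: (batch, patches, dim);
    mask: (batch, patches) bool, True where the patch was masked out."""
    pred = F.normalize(student_feats[mask], dim=-1)
    target = F.normalize(teacher_feats[mask], dim=-1)
    # negative cosine similarity, computed on masked positions only
    return -(pred * target).sum(dim=-1).mean()

B, N, D = 2, 196, 768
loss = masked_feature_loss(torch.randn(B, N, D), torch.randn(B, N, D),
                           torch.rand(B, N) > 0.6)  # ~40% of patches masked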
The scaling of Transformers has driven breakthrough capabilities for language models. At present, the largest large language models (LLMs) contain upwards of 100B parameters …
M Cherti, R Beaumont, R Wightman… - Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023 - openaccess.thecvf.com
Scaling up neural networks has led to remarkable performance across a wide range of tasks. Moreover, performance often follows reliable scaling laws as a function of training set …
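A common functional form for such scaling laws is a power law in training set size, error ≈ a * N^(-b). A toy sketch of fitting one by linear regression in log-log space; every number below is invented for illustration.

# Toy sketch: fit error ≈ a * N**(-b) to (training set size, error) pairs.
import numpy as np

n = np.array([1e6, 1e7, 1e8, 1e9])        # training set sizes (made up)
err = np.array([0.42, 0.30, 0.21, 0.15])  # downstream error rates (made up)

# log err = slope * log N + log a, with slope = -b
slope, log_a = np.polyfit(np.log(n), np.log(err), 1)
print(f"fitted exponent b = {-slope:.3f}, prefactor a = {np.exp(log_a):.3f}")
print(f"extrapolated error at N=1e10: {np.exp(log_a) * 1e10 ** slope:.3f}")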
A Radford, JW Kim, T Xu, G Brockman… - International Conference on Machine Learning (ICML), 2023 - proceedings.mlr.press
We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …
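As a usage note, the checkpoints released with this work can be run through the open-source openai-whisper package; a minimal transcription example, where the audio path is a placeholder.

# Minimal transcription with the released Whisper models
# (pip install openai-whisper); "audio.mp3" is a placeholder path.
import whisper

model = whisper.load_model("base")      # small multilingual checkpoint
result = model.transcribe("audio.mp3")  # language detection + decoding
print(result["text"])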
We present a method to formulate algorithm discovery as program search, and apply it to discover optimization algorithms for deep neural network training. We leverage efficient …
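This program search is best known for discovering the Lion optimizer, whose update takes the sign of an interpolation of gradient and momentum. The sketch below reproduces that rule from the paper's published pseudocode, from memory; it is not the authors' code.

# Sketch of the Lion update rule discovered by the program search.
import numpy as np

def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion step on weights w with gradient g and momentum m."""
    update = np.sign(beta1 * m + (1 - beta1) * g)  # interpolate, then sign
    w = w - lr * (update + wd * w)                 # decoupled weight decay
    m = beta2 * m + (1 - beta2) * g                # momentum update
    return w, m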
Pre-trained vision-language (VL) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts …
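The prompt sensitivity mentioned in this snippet is easy to observe directly; an illustrative check with the Hugging Face transformers CLIP API, where the model name, image path, and class names are placeholders chosen for the example.

# Illustrating CLIP's sensitivity to the prompt template; "cat.jpg" and the
# class names are placeholders.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open("cat.jpg")

for template in ["a photo of a {}", "a blurry photo of a {}", "{}"]:
    texts = [template.format(c) for c in ["cat", "dog"]]
    inputs = processor(text=texts, images=image,
                       return_tensors="pt", padding=True)
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
    print(template, probs.tolist())  # probabilities shift with the template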
We propose a simple pairwise sigmoid loss for image-text pre-training. Unlike standard contrastive learning with softmax normalization, the sigmoid loss operates solely on image …
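A minimal sketch of that pairwise sigmoid loss, following the paper's formulation: every image-text pair in the batch is treated as an independent binary classification problem, with +1 labels on matched pairs and -1 elsewhere. The temperature t and bias b are learnable scalars in the paper, fixed constants here for brevity, and the loss is averaged over all pairs rather than normalized by batch size.

# Sketch of the pairwise sigmoid loss for image-text pretraining.
import torch
import torch.nn.functional as F

def sigmoid_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """img_emb, txt_emb: L2-normalized (batch, dim) embeddings."""
    logits = img_emb @ txt_emb.T * t + b     # (batch, batch) pair logits
    labels = 2 * torch.eye(len(logits)) - 1  # +1 matched pairs, -1 otherwise
    # each pair is an independent binary classification problem
    return -F.logsigmoid(labels * logits).mean()

z_img = F.normalize(torch.randn(8, 512), dim=-1)
z_txt = F.normalize(torch.randn(8, 512), dim=-1)
print(sigmoid_loss(z_img, z_txt))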