Robust fine-tuning of zero-shot models

M Wortsman, G Ilharco, JW Kim, M Li… - Proceedings of the …, 2022 - openaccess.thecvf.com
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of
data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific …
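The core operation in this line of work is weight-space interpolation between the zero-shot and fine-tuned models. A minimal sketch, assuming a dict-of-scalars parameter layout (real weights would be tensors; the function name and layout are illustrative, not the paper's implementation):

```python
def interpolate_weights(zero_shot, fine_tuned, alpha=0.5):
    """Blend two models' parameters: (1 - alpha) * zero_shot + alpha * fine_tuned.

    `zero_shot` and `fine_tuned` are {parameter name: value} maps with
    identical keys; scalars stand in for weight tensors in this sketch.
    """
    assert zero_shot.keys() == fine_tuned.keys()
    return {
        name: (1 - alpha) * zero_shot[name] + alpha * fine_tuned[name]
        for name in zero_shot
    }
```

Setting `alpha=0` recovers the zero-shot model and `alpha=1` the fully fine-tuned one; intermediate values trade in-distribution accuracy against robustness under distribution shift.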

Fusing finetuned models for better pretraining

L Choshen, E Venezian, N Slonim, Y Katz - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained models are the standard starting point for training. This approach consistently
outperforms the use of a random initialization. However, pretraining is a costly endeavour …
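One simple way such fusing can be sketched is a uniform elementwise average of several fine-tuned checkpoints; the paper's actual procedure may weight or select models differently, so this is an assumption for illustration only:

```python
def fuse_models(state_dicts):
    """Uniform elementwise average of several models' parameters.

    Each entry of `state_dicts` is a {parameter name: value} map with the
    same keys; scalars stand in for weight tensors in this sketch.
    """
    n = len(state_dicts)
    return {key: sum(sd[key] for sd in state_dicts) / n
            for key in state_dicts[0]}
```

The fused parameters can then serve as the starting point for further fine-tuning in place of the original pretrained checkpoint.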

Proving linear mode connectivity of neural networks via optimal transport

D Ferbach, B Goujaud, G Gidel… - International …, 2024 - proceedings.mlr.press
The energy landscape of high-dimensional non-convex optimization problems is crucial to
understanding the effectiveness of modern deep neural network architectures. Recent works …

Model zoos: A dataset of diverse populations of neural network models

K Schürholt, D Taskiran, B Knyazev… - Advances in …, 2022 - proceedings.neurips.cc
In recent years, neural networks (NNs) have evolved from laboratory environments to the
state-of-the-art for many real-world problems. It was shown that NN models (i.e., their weights …

What can linear interpolation of neural network loss landscapes tell us?

TJ Vlaar, J Frankle - International Conference on Machine …, 2022 - proceedings.mlr.press
Studying neural network loss landscapes provides insights into the nature of the underlying
optimization problems. Unfortunately, loss landscapes are notoriously difficult to visualize in …
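A 1-D linear interpolation plot of the kind studied here is easy to produce: evaluate the loss at evenly spaced points on the segment between two parameter vectors. A minimal sketch, assuming parameters flattened into plain lists:

```python
def loss_along_line(loss_fn, theta_a, theta_b, num_points=11):
    """Loss at evenly spaced points on the segment (1 - t) * theta_a + t * theta_b."""
    curve = []
    for i in range(num_points):
        t = i / (num_points - 1)  # t sweeps 0.0 .. 1.0 inclusive
        theta = [(1 - t) * a + t * b for a, b in zip(theta_a, theta_b)]
        curve.append(loss_fn(theta))
    return curve
```

For two trained networks, a pronounced bump in this curve indicates a loss barrier between the solutions, while a flat or monotone curve is the usual evidence for (linear) mode connectivity.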

The empirical impact of neural parameter symmetries, or lack thereof

D Lim, TM Putterman, R Walters, H Maron… - arXiv preprint arXiv …, 2024 - arxiv.org
Many algorithms and observed phenomena in deep learning appear to be affected by
parameter symmetries--transformations of neural network parameters that do not change the …

End-to-end bias mitigation: Removing gender bias in deep learning

T Feldman, A Peake - arXiv preprint arXiv:2104.02532, 2021 - arxiv.org
Machine Learning models have been deployed across many different aspects of society,
often in situations that affect social welfare. Although these models offer streamlined …

Merging by matching models in task subspaces

D Tam, M Bansal, C Raffel - arXiv preprint arXiv:2312.04339, 2023 - arxiv.org
Model merging aims to cheaply combine individual task-specific models into a single
multitask model. In this work, we view past merging methods as leveraging different notions …

Robustness of edited neural networks

D Brown, C Godfrey, C Nizinski, J Tu… - ICLR 2023 Workshop on …, 2023 - openreview.net
Successful deployment in uncertain, real-world environments requires that deep learning
models can be efficiently and reliably modified in order to adapt to unexpected issues …

Towards noise-adaptive, problem-adaptive (accelerated) stochastic gradient descent

S Vaswani, B Dubois-Taine… - … on machine learning, 2022 - proceedings.mlr.press
We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in
the stochastic gradients and (ii) problem-dependent constants. When minimizing smooth …