Deep generative models are becoming increasingly powerful, now generating diverse, high-fidelity, photo-realistic samples given text prompts. Have they reached the point where …
Massive web datasets play a key role in the success of large vision-language models like CLIP and Flamingo. However, raw web data is noisy, and existing filtering methods to …
X Du, Y Sun, J Zhu, Y Li - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Utilizing auxiliary outlier datasets to regularize the machine learning model has demonstrated promise for out-of-distribution (OOD) detection and safe prediction. Due to the …
Benefiting from prompt tuning, pre-trained vision-language models, e.g., CLIP, have shown promising performance on versatile downstream tasks in recent years. In this paper …
Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this …
Semantic segmentation has witnessed tremendous progress due to the proposal of various advanced network architectures. However, they are extremely hungry for delicate …
Recent advancements in Multimodal Large Language Models (MLLMs) have utilized Visual Prompt Generators (VPGs) to convert visual features into tokens that LLMs can …
The visual classification performance of vision-language models such as CLIP has been shown to benefit from additional semantic knowledge from large language models (LLMs) …
In this work, we investigate the problem of Model-Agnostic Zero-Shot Classification (MA-ZSC), which refers to training non-specific classification architectures (downstream models) …