Intriguing properties of generative classifiers

A Sauer, F Boesel, T Dockhorn, A Blattmann… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion models are the main driver of progress in image and video synthesis, but suffer
from slow inference speed. Distillation methods, like the recently introduced adversarial …

Cited by 29 · Related articles · All 2 versions

[PDF] thecvf.com

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

S Koley, AK Bhunia, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com

This paper for the first time explores text-to-image diffusion models for Zero-Shot Sketch-
based Image Retrieval (ZS-SBIR). We highlight a pivotal discovery: the capacity of text-to …

Cited by 5 · Related articles · All 4 versions

[PDF] arxiv.org

Getting aligned on representational alignment

I Sucholutsky, L Muttenthaler, A Weller, A Peng… - arXiv preprint arXiv …, 2023 - arxiv.org

Biological and artificial information processing systems form representations of the world
that they can use to categorize, reason, plan, navigate, and make decisions. To what extent …

Cited by 38 · Related articles · All 2 versions

[PDF] thecvf.com

Visual anagrams: Generating multi-view optical illusions with diffusion models

D Geng, I Park, A Owens - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

We address the problem of synthesizing multi-view optical illusions: images that change
appearance upon a transformation such as a flip or rotation. We propose a simple zero-shot …

Cited by 7 · Related articles · All 4 versions

[PDF] arxiv.org

An introduction to vision-language modeling

F Bordes, RY Pang, A Ajay, AC Li, A Bardes… - arXiv preprint arXiv …, 2024 - arxiv.org

Following the recent popularity of Large Language Models (LLMs), several attempts have
been made to extend them to the visual domain. From having a visual assistant that could …

Cited by 14 · Related articles · All 2 versions

[PDF] thecvf.com

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models

S Motamed, W Van Gansbeke… - Proceedings of the …, 2024 - openaccess.thecvf.com

With recent advances in image and video diffusion models for content creation a plethora of
techniques have been proposed for customizing their generated content. In particular …

Can Biases in ImageNet Models Explain Generalization?

P Gavrikov, J Keuper - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

The robust generalization of models to rare in-distribution (ID) samples drawn from the long
tail of the training distribution and to out-of-training-distribution (OOD) samples is one of the …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Factorized diffusion: Perceptual illusions by noise decomposition

D Geng, I Park, A Owens - arXiv preprint arXiv:2404.11615, 2024 - arxiv.org

Given a factorization of an image into a sum of linear components, we present a zero-shot
method to control each individual component through diffusion model sampling. For …

Cited by 3 · Related articles · All 3 versions

[PDF] arxiv.org

RadEdit: stress-testing biomedical vision models via diffusion image editing

F Pérez-García, S Bond-Taylor, PP Sanchez… - arXiv preprint arXiv …, 2023 - arxiv.org

Biomedical imaging datasets are often small and biased, meaning that real-world
performance of predictive models can be substantially lower than expected from internal …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Layerwise complexity-matched learning yields an improved model of cortical area V2

N Parthasarathy, OJ Hénaff, EP Simoncelli - arXiv preprint arXiv …, 2023 - arxiv.org

Human ability to recognize complex visual patterns arises through transformations
performed by successive areas in the ventral visual cortex. Deep neural networks trained …

Cited by 1 · Related articles · All 3 versions
