Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

InstructPix2Pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

Synthetic data from diffusion models improves ImageNet classification

S Azizi, S Kornblith, C Saharia, M Norouzi… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep generative models are becoming increasingly powerful, now generating diverse
high-fidelity photo-realistic samples given text prompts. Have they reached the point where …

Drag your GAN: Interactive point-based manipulation on the generative image manifold

X Pan, A Tewari, T Leimkühler, L Liu, A Meka… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
Synthesizing visual content that meets users' needs often requires flexible and precise
controllability of the pose, shape, expression, and layout of the generated objects. Existing …

GAN inversion: A survey

W Xia, Y Zhang, Y Yang, JH Xue… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN
model so that the image can be faithfully reconstructed from the inverted code by the …

Dataset diffusion: Diffusion-based synthetic data generation for pixel-level semantic segmentation

Q Nguyen, T Vu, A Tran… - Advances in Neural …, 2024 - proceedings.neurips.cc
Preparing training data for deep vision models is a labor-intensive task. To address this,
generative models have emerged as an effective solution for generating synthetic data …

Diversify your vision datasets with automatic diffusion-based augmentation

L Dunlap, A Umino, H Zhang, J Yang… - Advances in neural …, 2023 - proceedings.neurips.cc
Many fine-grained classification tasks, like rare animal identification, have limited training
data and consequently classifiers trained on these datasets often fail to generalize to …

Few shot semantic segmentation: a review of methodologies and open challenges

N Catalano, M Matteucci - arXiv preprint arXiv:2304.05832, 2023 - arxiv.org
Semantic segmentation assigns category labels to each pixel in an image, enabling
breakthroughs in fields such as autonomous driving and robotics. Deep Neural Networks …

BigDatasetGAN: Synthesizing ImageNet with pixel-wise annotations

D Li, H Ling, SW Kim, K Kreis… - Proceedings of the …, 2022 - openaccess.thecvf.com
Annotating images with pixel-wise labels is a time-consuming and costly process. Recently,
DatasetGAN showcased a promising alternative: to synthesize a large labeled dataset via a …

Real-time radiance fields for single-image portrait view synthesis

A Trevithick, M Chan, M Stengel, E Chan… - ACM Transactions on …, 2023 - dl.acm.org
We present a one-shot method to infer and render a photorealistic 3D representation from a
single unposed image (e.g., face portrait) in real-time. Given a single RGB input, our image …