StyleGAN knows normal, depth, albedo, and more

A Bhattad, D McKee, D Hoiem… - Advances in Neural …, 2024 - proceedings.neurips.cc
Intrinsic images, in the original sense, are image-like maps of scene properties like depth,
normal, albedo, or shading. This paper demonstrates that StyleGAN can easily be induced …

Beyond RGB: Scene-property synthesis with neural radiance fields

M Zhang, S Zheng, Z Bao, M Hebert… - Proceedings of the …, 2023 - openaccess.thecvf.com
Comprehensive 3D scene understanding, both geometrically and semantically, is important
for real-world applications such as robot perception. Most of the existing work has focused …

Generative models: What do they know? Do they know things? Let's find out!

X Du, N Kolkin, G Shakhnarovich, A Bhattad - arXiv preprint arXiv …, 2023 - arxiv.org
Generative models excel at mimicking real scenes, suggesting they might inherently encode
important intrinsic scene properties. In this paper, we aim to explore the following key …

Multi-task View Synthesis with Neural Radiance Fields

S Zheng, Z Bao, M Hebert… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Multi-task visual learning is a critical aspect of computer vision. Current research, however,
predominantly concentrates on the multi-task dense prediction setting, which overlooks the …

Consistent multimodal generation via a unified GAN framework

Z Zhu, Y Li, W Lyu, KK Singh, Z Shu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We investigate how to generate multimodal image outputs, such as RGB, depth, and surface
normals, with a single generative model. The challenge is to produce outputs that are …

Joint-Task Regularization for Partially Labeled Multi-Task Learning

K Nishi, J Kim, W Li, H Pfister - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Multi-task learning has become increasingly popular in the machine learning field but its
practicality is hindered by the need for large labeled datasets. Most multi-task learning …

Closed-Loop Unsupervised Representation Disentanglement with β-VAE Distillation and Diffusion Probabilistic Feedback

X Jin, B Li, B Xie, W Zhang, J Liu, Z Li, T Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Representation disentanglement may help AI fundamentally understand the real world and
thus benefit both discrimination and generation tasks. It currently has at least three …

An inductive bias from quantum mechanics: learning order effects with non-commuting measurements

K Gili, G Alonso, M Schuld - Quantum Machine Intelligence, 2024 - Springer
There are two major approaches to building good machine learning algorithms: feeding lots
of data into large models or picking a model class with an “inductive bias” that suits the …

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

S Zheng, Z Bao, R Zhao, M Hebert… - arXiv preprint arXiv …, 2024 - arxiv.org
Beyond high-fidelity image synthesis, diffusion models have recently exhibited promising
results in dense visual perception tasks. However, most existing work treats diffusion models …

Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning

Y Lu, S Cao, YX Wang - arXiv preprint arXiv:2410.14633, 2024 - arxiv.org
Vision Foundation Models (VFMs) have demonstrated outstanding performance on
numerous downstream tasks. However, due to their inherent representation biases …