Multimodal image synthesis and editing: A survey and taxonomy

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

Nerf: Neural radiance field in 3d vision, a comprehensive review

K Gao, Y Gao, H He, D Lu, L Xu, J Li - arXiv preprint arXiv:2210.00379, 2022 - arxiv.org
Neural Radiance Field (NeRF), a new novel view synthesis with implicit scene
representation has taken the field of Computer Vision by storm. As a novel view synthesis …

Instruct-nerf2nerf: Editing 3d scenes with instructions

A Haque, M Tancik, AA Efros… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a method for editing NeRF scenes with text-instructions. Given a NeRF of a
scene and the collection of images used to reconstruct it, our method uses an image …

Nerfstudio: A modular framework for neural radiance field development

M Tancik, E Weber, E Ng, R Li, B Yi, T Wang… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
Neural Radiance Fields (NeRF) are a rapidly growing area of research with wide-ranging
applications in computer vision, graphics, robotics, and more. In order to streamline the …

Hexplane: A fast representation for dynamic scenes

A Cao, J Johnson - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D vision. Prior
approaches build on NeRF and rely on implicit representations. This is slow since it requires …

Text-to-3d using gaussian splatting

Z Chen, F Wang, Y Wang, H Liu - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Automatic text-to-3D generation that combines Score Distillation Sampling (SDS) with the
optimization of volume rendering has achieved remarkable progress in synthesizing realistic …

Latent-nerf for shape-guided generation of 3d shapes and textures

G Metzer, E Richardson, O Patashnik… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided image generation has progressed rapidly in recent years, inspiring major
breakthroughs in text-guided shape generation. Recently, it has been shown that using …

Gaussian grouping: Segment and edit anything in 3d scenes

M Ye, M Danelljan, F Yu, L Ke - European Conference on Computer …, 2025 - Springer
Abstract The recent Gaussian Splatting achieves high-quality and real-time novel-view
synthesis of the 3D scenes. However, it is solely concentrated on the appearance and …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Text2room: Extracting textured 3d meshes from 2d text-to-image models

L Höllein, A Cao, A Owens… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We present Text2Room, a method for generating room-scale textured 3D meshes
from a given text prompt as input. To this end, we leverage pre-trained 2D text-to-image …