Unidream: Unifying diffusion priors for relightable text-to-3d generation

Z Liu, Y Li, Y Lin, X Yu, S Peng, YP Cao, X Qi… - … on Computer Vision, 2025 - Springer
Recent advancements in text-to-3D generation technology have significantly improved the
conversion of textual descriptions into imaginative, geometrically sound, and finely textured 3D …

Sv4d: Dynamic 3d content generation with multi-frame and multi-view consistency

Y Xie, CH Yao, V Voleti, H Jiang, V Jampani - arXiv preprint arXiv …, 2024 - arxiv.org
We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-
view consistent dynamic 3D content generation. Unlike previous methods that rely on …

Scaledreamer: Scalable text-to-3d synthesis with asynchronous score distillation

Z Ma, Y Wei, Y Zhang, X Zhu, Z Lei, L Zhang - European Conference on …, 2025 - Springer
By leveraging the text-to-image diffusion prior, score distillation can synthesize 3D content
without paired text-3D training data. Instead of spending hours of online optimization per text …

G3r: Gradient guided generalizable reconstruction

Y Chen, J Wang, Z Yang, S Manivasagam… - … on Computer Vision, 2025 - Springer
Large-scale 3D scene reconstruction is important for applications such as virtual reality and
simulation. Existing neural rendering approaches (e.g., NeRF, 3DGS) have achieved realistic …

Cycle3d: High-quality and consistent image-to-3d generation via generation-reconstruction cycle

Z Tang, J Zhang, X Cheng, W Yu, C Feng… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent large 3D reconstruction models typically employ a two-stage process: first generating
multi-view images with a multi-view diffusion model, and then utilizing a feed-forward …

Geolrm: Geometry-aware large reconstruction model for high-quality 3d gaussian generation

C Zhang, H Song, Y Wei, Y Chen, J Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an
approach that can predict high-quality assets with 512K Gaussians from 21 input images in …

Meshanything v2: Artist-created mesh generation with adjacent mesh tokenization

Y Chen, Y Wang, Y Luo, Z Wang, Z Chen, J Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce MeshAnything V2, an autoregressive transformer that generates Artist-Created
Meshes (AM) aligned to given shapes. It can be integrated with various 3D asset production …

Gaussianobject: High-quality 3d object reconstruction from four views with gaussian splatting

C Yang, S Li, J Fang, R Liang, L Xie, X Zhang… - ACM Transactions on …, 2024 - dl.acm.org
Reconstructing and rendering 3D objects from highly sparse views is of critical importance
for promoting applications of 3D vision techniques and improving user experience …

3dtopia-xl: Scaling high-quality 3d asset generation via primitive diffusion

Z Chen, J Tang, Y Dong, Z Cao, F Hong, Y Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
The increasing demand for high-quality 3D assets across various industries necessitates
efficient and automated 3D content creation. Despite recent advancements in 3D generative …

Lvsm: A large view synthesis model with minimal 3d inductive bias

H Jin, H Jiang, H Tan, K Zhang, S Bi, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose the Large View Synthesis Model (LVSM), a novel transformer-based approach
for scalable and generalizable novel view synthesis from sparse-view inputs. We introduce …