Character-centric story visualization via visual planning and token alignment

T Rahman, HY Lee, J Ren, S Tulyakov… - Proceedings of the …, 2023 - openaccess.thecvf.com

There has been a recent explosion of impressive generative models that can produce high
quality images (or videos) conditioned on text descriptions. However, all such approaches …

Cited by 54 · Related articles · All 7 versions

[PDF] thecvf.com

Synthesizing coherent story with auto-regressive latent diffusion models

X Pan, P Qin, Y Li, H Xue… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis
capacity. Recently, most works focus on synthesizing independent images; While for real …

Cited by 59 · Related articles · All 5 versions

[PDF] arxiv.org

Boosting consistency in story visualization with rich-contextual conditional diffusion models

F Shen, H Ye, S Liu, J Zhang, C Wang, X Han… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent research showcases the considerable potential of conditional diffusion models for
generating consistent stories. However, current methods, which predominantly generate …

Cited by 17 · Related articles · All 3 versions

[PDF] thecvf.com

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

C Liu, H Wu, Y Zhong, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Generative models have recently exhibited exceptional capabilities in text-to-image
generation but still struggle to generate image sequences coherently. In this work we focus …

Cited by 37 · Related articles · All 4 versions

AutoStory: Generating Diverse Storytelling Images with Minimal Human Efforts

W Wang, C Zhao, H Chen, Z Chen, K Zheng… - International Journal of …, 2024 - Springer

Story visualization aims to generate a series of images that match the story described in
texts, and it requires the generated images to satisfy high quality, alignment with the text …

Cited by 12 · Related articles · All 2 versions

[PDF] neurips.cc

Controllable text generation with neurally-decomposed oracle

T Meng, S Lu, N Peng… - Advances in Neural …, 2022 - proceedings.neurips.cc

We propose a general and efficient framework to control auto-regressive generation models
with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and …

Cited by 32 · Related articles · All 6 versions

[PDF] thecvf.com

Story visualization by online text augmentation with context memory

D Ahn, D Kim, G Song, SH Kim, H Lee… - Proceedings of the …, 2023 - openaccess.thecvf.com

Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not
only rendering visual details from the text descriptions but also encoding a longterm context …

Cited by 6 · Related articles · All 7 versions

[PDF] arxiv.org

Talecrafter: Interactive story visualization with multiple characters

Y Gong, Y Pang, X Cun, M Xia, Y He, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Accurate Story visualization requires several necessary elements, such as identity
consistency across frames, the alignment between plain text and visual content, and a …

Cited by 35 · Related articles · All 2 versions

[PDF] arxiv.org

Survey: Transformer-based Models in Data Modality Conversion

E Rashno, A Eskandari, A Anand… - arXiv preprint arXiv …, 2024 - arxiv.org

Transformers have made significant strides across various artificial intelligence domains,
including natural language processing, computer vision, and audio processing. This …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Storyimager: A unified and efficient framework for coherent story visualization and completion

M Tao, BK Bao, H Tang, Y Wang, C Xu - European Conference on …, 2025 - Springer

Story visualization aims to generate a series of realistic and coherent images based on a
storyline. Current models adopt a frame-by-frame architecture by transforming the pre …

Cited by 2 · Related articles · All 2 versions
