Make-a-story: Visual memory conditioned consistent story generation

T Rahman, HY Lee, J Ren, S Tulyakov… - Proceedings of the …, 2023 - openaccess.thecvf.com
There has been a recent explosion of impressive generative models that can produce high
quality images (or videos) conditioned on text descriptions. However, all such approaches …

Synthesizing coherent story with auto-regressive latent diffusion models

X Pan, P Qin, Y Li, H Xue… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis
capacity. Recently, most works focus on synthesizing independent images; while for real …

Boosting consistency in story visualization with rich-contextual conditional diffusion models

F Shen, H Ye, S Liu, J Zhang, C Wang, X Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research showcases the considerable potential of conditional diffusion models for
generating consistent stories. However, current methods, which predominantly generate …

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

C Liu, H Wu, Y Zhong, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative models have recently exhibited exceptional capabilities in text-to-image
generation but still struggle to generate image sequences coherently. In this work we focus …

AutoStory: Generating Diverse Storytelling Images with Minimal Human Efforts

W Wang, C Zhao, H Chen, Z Chen, K Zheng… - International Journal of …, 2024 - Springer
Story visualization aims to generate a series of images that match the story described in
texts, and it requires the generated images to satisfy high quality, alignment with the text …

Controllable text generation with neurally-decomposed oracle

T Meng, S Lu, N Peng… - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose a general and efficient framework to control auto-regressive generation models
with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and …

Story visualization by online text augmentation with context memory

D Ahn, D Kim, G Song, SH Kim, H Lee… - Proceedings of the …, 2023 - openaccess.thecvf.com
Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not
only rendering visual details from the text descriptions but also encoding a long-term context …

Talecrafter: Interactive story visualization with multiple characters

Y Gong, Y Pang, X Cun, M Xia, Y He, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Accurate story visualization requires several necessary elements, such as identity
consistency across frames, the alignment between plain text and visual content, and a …

Survey: Transformer-based Models in Data Modality Conversion

E Rashno, A Eskandari, A Anand… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have made significant strides across various artificial intelligence domains,
including natural language processing, computer vision, and audio processing. This …

Storyimager: A unified and efficient framework for coherent story visualization and completion

M Tao, BK Bao, H Tang, Y Wang, C Xu - European Conference on …, 2025 - Springer
Story visualization aims to generate a series of realistic and coherent images based on a
storyline. Current models adopt a frame-by-frame architecture by transforming the pre …