Augmented behavioral annotation tools, with application to multimodal datasets and models: a systematic review

E Watson, T Viana, S Zhang - AI, 2023 - mdpi.com
Annotation tools are an essential component in the creation of datasets for machine learning
purposes. Annotation tools have evolved greatly since the turn of the century, and now …

Diffusion-based visual art creation: A survey and new perspectives

B Wang, Q Chen, Z Wang - arXiv preprint arXiv:2408.12128, 2024 - arxiv.org
The integration of generative AI in visual art has revolutionized not only how visual content is
created but also how AI interacts with and reflects the underlying domain knowledge. This …

SWF-GAN: A Text-to-Image model based on sentence–word fusion perception

C Liu, J Hu, H Lin - Computers & Graphics, 2023 - Elsevier
Synthesizing images from descriptive text is an exciting and challenging task in multimodal
deep learning, which has broad prospects of application in the fields of visual reasoning …

Automatic Generation of Multimedia Teaching Materials Based on Generative AI: Taking Tang Poetry as an Example

X Chen, D Wu - IEEE Transactions on Learning Technologies, 2024 - ieeexplore.ieee.org
Generative artificial intelligence (AI) is widely recognized as one of the most influential
technologies for the future, having sparked a paradigm shift in scientific research. The field …

Image2text2image: A novel framework for label-free evaluation of image-to-text generation with text-to-image diffusion models

JH Huang, H Zhu, Y Shen, S Rudinac… - … on Multimedia Modeling, 2025 - Springer
Evaluating the quality of automatically generated image descriptions is a complex task that
requires metrics capturing various dimensions, such as grammaticality, coverage, accuracy …

Ada-HGNN: Adaptive Sampling for Scalable Hypergraph Neural Networks

S Wang, DW Zhang, JH Huang, S Rudinac… - arXiv preprint arXiv …, 2024 - arxiv.org
Hypergraphs serve as an effective model for depicting complex connections in various
real-world scenarios, from social to biological networks. The development of Hypergraph Neural …

DeepArt: A Benchmark to Advance Fidelity Research in AI-Generated Content

W Wang, X Huang, T Wang, SK Roy - arXiv preprint arXiv:2312.10407, 2023 - arxiv.org
This paper explores the image synthesis capabilities of GPT-4, a leading multi-modal large
language model. We establish a benchmark for evaluating the fidelity of texture features in …

Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models

S Jamil, BA Reddy, R Kumar, S Saha… - arXiv preprint arXiv …, 2025 - arxiv.org
The task of text-to-image generation has encountered significant challenges when applied
to literary works, especially poetry. Poems are a distinct form of literature, with meanings that …

FHS-adapter: fine-grained hierarchical semantic adapter for Chinese landscape paintings generation

X Peng, Q Hu, F Fan, P Xie, Y Zhang, R Cao - Heritage Science, 2024 - Springer
How to migrate text-to-image models based on pre-trained diffusion models to adapt them to
domain generation tasks is a common problem. In particular, the generation task for Chinese …

EEBO-Verse: Sifting for Poetry in Large Early Modern Corpora Using Visual Features

D Chen, N Jiang, T Berg-Kirkpatrick - International Conference on …, 2023 - Springer
One branch of important digital humanities research focuses on the study of poetry and
verse, leveraging large corpora to reveal patterns and trends. However, this work is limited …