Text to image generation with semantic-spatial aware GAN

L Xu, Q Tang, J Lv, B Zheng, X Zeng, W Li - Neurocomputing, 2023 - Elsevier

Image captioning, also called report generation in medical field, aims to describe visual
content of images in human language, which requires to model semantic relationship …

Cited by 13 Related articles All 2 versions

[HTML] sciencedirect.com

[HTML] A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation

AA Khan, O Chaudhari, R Chandra - Expert Systems with Applications, 2023 - Elsevier

Class imbalance (CI) in classification problems arises when the number of observations
belonging to one class is lower than the other. Ensemble learning combines multiple models …

Cited by 24 Related articles All 4 versions

[PDF] thecvf.com

Toward verifiable and reproducible human evaluation for text-to-image generation

M Otani, R Togashi, Y Sawai… - Proceedings of the …, 2023 - openaccess.thecvf.com

Human evaluation is critical for validating the performance of text-to-image generative
models, as this highly cognitive process requires deep comprehension of text and images …

Cited by 42 Related articles All 6 versions

[PDF] aaai.org

Frido: Feature pyramid diffusion for complex scene image synthesis

WC Fan, YC Chen, DD Chen, Y Cheng… - Proceedings of the …, 2023 - ojs.aaai.org

Diffusion models (DMs) have shown great potential for high-quality image synthesis.
However, when it comes to producing images with complex scenes, how to properly …

Cited by 60 Related articles All 4 versions

[PDF] neurips.cc

TextDiffuser: Diffusion models as text painters

J Chen, Y Huang, T Lv, L Cui… - Advances in Neural …, 2024 - proceedings.neurips.cc

Diffusion models have gained increasing attention for their impressive generation abilities
but currently struggle with rendering accurate and coherent text. To address this issue, we …

Cited by 26 Related articles All 4 versions

[PDF] arxiv.org

Taming encoder for zero fine-tuning image customization with text-to-image diffusion models

X Jia, Y Zhao, KCK Chan, Y Li, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

This paper proposes a method for generating images of customized objects specified by
users. The method is based on a general framework that bypasses the lengthy optimization …

Cited by 60 Related articles All 2 versions

[PDF] thecvf.com

SceneComposer: Any-level semantic image synthesis

Y Zeng, Z Lin, J Zhang, Q Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose a new framework for conditional image synthesis from semantic layouts of any
precision levels, ranging from pure text to a 2D semantic canvas with precise shapes. More …

Cited by 29 Related articles All 6 versions

[PDF] thecvf.com

Shape-aware text-driven layered video editing

YC Lee, JZG Jang, YT Chen, E Qiu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Temporal consistency is essential for video editing applications. Existing work on layered
representation of videos allows propagating edits consistently to each frame. These …

Cited by 22 Related articles All 5 versions

[PDF] thecvf.com

Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion

X Yi, H Xu, H Zhang, L Tang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Image fusion aims to combine information from different source images to create a
comprehensively representative image. Existing fusion methods are typically helpless in …

Cited by 2 Related articles All 3 versions

A comprehensive survey on generative adversarial networks used for synthesizing multimedia content

L Kumar, DK Singh - Multimedia Tools and Applications, 2023 - Springer

GANs are playing an important role in creating and generating a new set of data from the
previously available content. GAN models are impressive in the results for image and video …

Cited by 18 Related articles All 2 versions
