StyleDiffusion: Prompt-embedding inversion for text-based editing

S Li, J van de Weijer, T Hu, FS Khan, Q Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
A significant research effort is focused on exploiting the amazing capacities of pretrained
diffusion models for the editing of images. They either finetune the model, or invert the image …

TSingNet: Scale-aware and context-rich feature learning for traffic sign detection and recognition in the wild

Y Liu, J Peng, JH Xue, Y Chen, ZH Fu - Neurocomputing, 2021 - Elsevier
Traffic sign detection and recognition in the wild is a challenging task. Existing techniques
are often incapable of detecting small or occluded traffic signs because of the scale variation …

Language with vision: A study on grounded word and sentence embeddings

H Shahmohammadi, M Heitmeier… - Behavior Research …, 2024 - Springer
Grounding language in vision is an active field of research seeking to construct cognitively
plausible word and sentence representations by incorporating perceptual knowledge from …

A Precise Framework for Rice Leaf Disease Image–Text Retrieval Using FHTW-Net

H Zhou, Y Hu, S Liu, G Zhou, J Xu, A Chen… - Plant …, 2024 - spj.science.org
Cross-modal retrieval for rice leaf diseases is crucial for prevention, providing agricultural
experts with data-driven decision support to address disease threats and safeguard rice …

Diverse and styled image captioning using singular value decomposition‐based mixture of recurrent experts

M Heidari, M Ghatee, A Nickabadi… - Concurrency and …, 2022 - Wiley Online Library
With significant advances in vision and natural language processing, the generation of
image captions becomes a need. Mathews, Xie, and He extended a new model to generate …