TextDiffuser: Diffusion models as text painters

J Chen, Y Huang, T Lv, L Cui… - Advances in Neural …, 2024 - proceedings.neurips.cc
Diffusion models have gained increasing attention for their impressive generation abilities
but currently struggle with rendering accurate and coherent text. To address this issue, we …

A Survey of AI Techniques in IoT Applications with Use Case Investigations in the Smart Environmental Monitoring and Analytics in Real-Time IoT Platform

YYF Panduman, N Funabiki, ED Fajrianti, S Fang… - Information, 2024 - mdpi.com
In this paper, we have developed the SEMAR (Smart Environmental Monitoring and
Analytics in Real-Time) IoT application server platform for fast deployments of IoT …

TRINS: Towards Multimodal Language Models that Can Read

R Zhang, Y Zhang, J Chen, Y Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large multimodal language models have shown remarkable proficiency in understanding
and editing images. However, a majority of these visually-tuned models struggle to …

A low-cost, high-performance middleware solution for unified parking management

Y Wang, D Liu, X Sun - Soft Computing, 2024 - Springer
The parking system is a vital element within intelligent transport systems, playing a pivotal
role in alleviating traffic congestion and mitigating associated stress and anxiety. In …

ARTIST: Improving the Generation of Text-rich Images by Disentanglement

J Zhang, Y Zhou, J Gu, C Wigington, T Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have demonstrated exceptional capabilities in generating a broad
spectrum of visual content, yet their proficiency in rendering text is still limited: they often …

Optical Character Recognition Using Optimized Convolutional Networks

A Nawaz, M Irfan, T Westerlund - 2023 Eighth International …, 2023 - ieeexplore.ieee.org
Optical Character Recognition (OCR) has been a prominent area of research in pattern
recognition for several decades, owing to its broad application potential in smart living. To …

Generative Data Augmentation for Arabic Handwritten Digit Recognition Boosting Real-time OCR Capabilities

M Memari, KR Ahmed, S Rahimi - Proceedings of the 2023 6th …, 2023 - dl.acm.org
This study assesses the effectiveness of Generative Adversarial Networks (GANs) and
Variational Autoencoders (VAEs) in enhancing Optical Character Recognition (OCR) …

Multi-granularity and Multi-modal Feature Fusion for Indoor Positioning

S Pei, Y Wang, H Zhao, Y Wang - 2024 - researchsquare.com
Despite the widespread use of indoor positioning technology, current WiFi positioning
methods face challenges in accuracy and efficiency due to the complexities of indoor …