We present the latest generation of MobileNets: MobileNetV4 (MNv4). They feature universally-efficient architecture designs for mobile devices. We introduce the Universal …
Large-kernel convolutional neural networks (ConvNets) have recently received extensive research attention but two unresolved and critical issues demand further investigation. 1) …
Contrastive pre-training of image-text foundation models such as CLIP demonstrated excellent zero-shot performance and improved robustness on a wide range of downstream …
Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial …
S Yun, Y Ro - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Abstract Recently efficient Vision Transformers have shown great performance with low latency on resource-constrained devices. Conventionally they use 4x4 patch embeddings …
Vision Transformers (ViTs) have achieved impressive performance over various computer vision tasks. However, modeling global correlations with multi-head self-attention (MSA) …
On-device machine learning (ML) promises to improve the privacy, responsiveness, and proliferation of new, intelligent user experiences by moving ML computation onto everyday …
Real-time semantic segmentation is a crucial component of autonomous driving systems, where accurate and efficient scene interpretation is essential to ensure both safety and …
Diffusion Transformers (DiT) excel at image and video generation but face computational challenges due to the quadratic complexity of self-attention operators. We propose …