Recent research on remote sensing object detection has largely focused on improving the representation of oriented bounding boxes but has overlooked the unique prior knowledge …
Today's deep learning methods focus on how to design the most appropriate objective functions so that the prediction results of the model can be closest to the ground truth …
We launch EVA-02, a next-generation Transformer-based visual representation pre-trained to reconstruct strong and robust language-aligned vision features via masked image …
Large-kernel convolutional neural networks (ConvNets) have recently received extensive research attention but two unresolved and critical issues demand further investigation. 1) …
At the heart of foundation models is the philosophy of "more is different", exemplified by the astonishing success in computer vision and natural language processing. However, the …
Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural …
Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their …
C Wang, W He, Y Nie, J Guo, C Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
In the past years, YOLO-series models have emerged as the leading approaches in the area of real-time object detection. Many studies pushed up the baseline to a higher level by …
In this paper, we summarize the 2nd NTIRE challenge on stereo image super-resolution (SR) with a focus on new solutions and results. The task of the challenge is to super-resolve …