DetToolChain: A new prompting paradigm to unleash detection ability of MLLM

Y Wu, Y Wang, S Tang, W Wu, T He, W Ouyang… - … on Computer Vision, 2025 - Springer
We present DetToolChain, a novel prompting paradigm, to unleash the zero-shot object
detection ability of multimodal large language models (MLLMs), such as GPT-4V and …

SimDETR: Simplifying self-supervised pretraining for DETR

IM Metaxas, A Bulat, I Patras, B Martinez… - arXiv preprint arXiv …, 2023 - arxiv.org
DETR-based object detectors have achieved remarkable performance but are sample-
inefficient and exhibit slow convergence. Unsupervised pretraining has been found to be …

SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers

G Jin, F Yang, M Sun, R Zhao, Y Liu, W Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Self-supervised pre-training and transformer-based networks have significantly improved
the performance of object detection. However, most of the current self-supervised object …

Supplementary Material SeqCo-DETR: Sequence Consistency Training for Self-supervised Object Detection with Transformers

G Jin, F Yang, M Sun, R Zhao, Y Liu, W Li, T Bao, L Wu… - 2023 - bmvc2022.mpi-inf.mpg.de
We organize the supplementary material as follows. The implementation details are given in
Sec. 2. More results and the ablation study are presented in Sec. 3. Then, we visualize the …

Simplifying Self-Supervised Object Detection Pretraining

IM Metaxas, A Bulat, I Patras, B Martinez… - openreview.net
Object detectors are often trained by first training the backbone in a self-supervised manner
and then fine-tuning the whole model on annotated data. An unsupervised detector …