CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos

N Karaev, I Makarov, J Wang, N Neverova… - arXiv preprint arXiv …, 2024 - arxiv.org
Most state-of-the-art point trackers are trained on synthetic data due to the difficulty of
annotating real videos for this task. However, this can result in suboptimal performance due …

TAPTRv2: Attention-based Position Update Improves Tracking Any Point

H Li, H Zhang, S Liu, Z Zeng, F Li, T Ren, B Li… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present TAPTRv2, a Transformer-based approach built upon TAPTR for
solving the Tracking Any Point (TAP) task. TAPTR borrows designs from DEtection …

X-Pose: Detecting Any Keypoints

J Yang, A Zeng, R Zhang, L Zhang - European Conference on Computer …, 2025 - Springer
This work aims to address an advanced keypoint detection problem: how to accurately
detect any keypoints in complex real-world scenarios, which involves massive, messy, and …

Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation

H Jeong, CHP Huang, JC Ye, N Mitra… - arXiv preprint arXiv …, 2024 - arxiv.org
While recent foundational video generators produce visually rich output, they still struggle
with appearance drift, where objects gradually degrade or change inconsistently across …

DELTA: Dense Efficient Long-range 3D Tracking for any video

TD Ngo, P Zhuang, C Gan, E Kalogerakis… - arXiv preprint arXiv …, 2024 - arxiv.org
Tracking dense 3D motion from monocular videos remains challenging, particularly when
aiming for pixel-level precision over long sequences. We introduce DELTA, a novel …

TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

J Qu, H Li, S Liu, T Ren, Z Zeng, L Zhang - arXiv preprint arXiv:2411.18671, 2024 - arxiv.org
In this paper, we present TAPTRv3, which is built upon TAPTRv2 to improve its point
tracking robustness in long videos. TAPTRv2 is a simple DETR-like framework that can …

MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation

J Serych, M Neoral, J Matas - arXiv preprint arXiv:2411.09551, 2024 - arxiv.org
In this work, we present MFTIQ, a novel dense long-term tracking model that advances the
Multi-Flow Tracker (MFT) framework to address challenges in point-level visual tracking in …

DIG3D: Marrying Gaussian Splatting with Deformable Transformer for Single Image 3D Reconstruction

J Wu, K Liu, H Gao, X Jiang, L Zhang - arXiv preprint arXiv:2404.16323, 2024 - arxiv.org
In this paper, we study the problem of 3D reconstruction from a single-view RGB image and
propose a novel approach called DIG3D for 3D object reconstruction and novel view …

UniG: Modelling Unitary 3D Gaussians for View-consistent 3D Reconstruction

J Wu, K Liu, Y Shi, X Jiang, Y Yao, L Zhang - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present UniG, a view-consistent 3D reconstruction and novel view synthesis
model that generates a high-fidelity representation of 3D Gaussians from sparse images …

Solution for Point Tracking Task of ECCV 2nd Perception Test Challenge 2024

Y Zhang, P Niu, K Yu, Q Chen, Y Yang - arXiv preprint arXiv:2410.16286, 2024 - arxiv.org
This report introduces an improved method for the Tracking Any Point (TAP), focusing on
monitoring physical surfaces in video footage. Despite their success with short-sequence …