BootsTAP: Bootstrapped training for tracking-any-point

C Doersch, P Luc, Y Yang, D Gokay… - Proceedings of the …, 2024 - openaccess.thecvf.com
To endow models with greater understanding of physics and motion, it is useful to enable
them to perceive how solid surfaces move and deform in real scenes. This can be formalized …

TAPVid-3D: A benchmark for tracking any point in 3D

S Koppula, I Rocco, Y Yang, J Heyward… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking
Any Point in 3D (TAP-3D). While point tracking in two dimensions (TAP) has many …

DELTA: Dense Efficient Long-range 3D Tracking for any video

TD Ngo, P Zhuang, C Gan, E Kalogerakis… - arXiv preprint arXiv …, 2024 - arxiv.org
Tracking dense 3D motion from monocular videos remains challenging, particularly when
aiming for pixel-level precision over long sequences. We introduce DELTA, a novel …

EgoPoints: Advancing Point Tracking for Egocentric Videos

A Darkhalil, R Guerrier, AW Harley… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce EgoPoints, a benchmark for point tracking in egocentric videos. We annotate
4.7K challenging tracks in egocentric sequences. Compared to the popular TAP-Vid-DAVIS …

Pixel-Level Tracking and Future Prediction in Video Streams

G Le Moing - 2024 - hal.science
Visual cues play a significant role for people in foreseeing (plausible) future events, a
fundamental skill that aids in social interactions, object manipulation, navigation, and …