FlowFormer introduces a transformer architecture into optical flow estimation and achieves state-of-the-art performance. The core component of FlowFormer is the transformer-based …
We introduce Infinigen, a procedural generator of photorealistic 3D scenes of the natural world. Infinigen is entirely procedural: every asset, from shape to texture, is generated from …
Despite impressive performance for high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo …
Optical flow, or the estimation of motion fields from image sequences, is one of the fundamental problems in computer vision. Unlike most pixel-wise tasks that aim at achieving …
Diffusion-based generative models are extremely effective in generating high-quality images, with generated samples often surpassing the quality of those produced by other …
Self-supervised learning of visual representations has been focusing on learning content features, which do not capture object motion or location, and focus on identifying and …
Recently, deep equilibrium models (DEQs) have drawn increasing attention from the machine learning community. However, DEQs are much less understood in terms of certified …
Cascaded computation, whereby predictions are recurrently refined over several stages, has been a persistent theme throughout the development of landmark detection models. In this …
S Wang, Y Teng, L Wang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Query-based object detectors directly decode image features into object instances with a set of learnable queries. These query vectors are progressively refined to stable meaningful …