Although no specific domain knowledge is considered in the design, plain vision transformers have shown excellent performance in visual recognition tasks. However, little …
We introduce Multi-view Ancestral Sampling (MAS), a method for 3D motion generation using 2D diffusion models that were trained on motions obtained from in-the-wild …
In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in …
M Kholiavchenko, J Kline, M Ramirez… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present a novel dataset for animal behavior recognition collected in-situ using video from drones flown over the Mpala Research Centre in Kenya. Videos from DJI Mavic 2S …
Animal visual perception is an important technique for automatically monitoring animal health, understanding animal behaviors, and assisting animal-related research. However, it …
This work proposes a unified framework called UniPose to detect keypoints of any articulated (e.g., human and animal), rigid, and soft objects via visual or textual prompts for …
Understanding the behavior of non-human primates is crucial for improving animal welfare, modeling social behavior, and gaining insights into distinctively human and phylogenetically …
Y Wang, H Xu, X Zhang, Z Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
We provide a two-way integration for the widely-adopted ControlNet by integrating external condition generation algorithms into a single dense prediction method and by integrating its …