Due to the lack of depth cues in images, multi-frame inputs are important for the success of vision-based perception, prediction, and planning in autonomous driving. Observations from …
Robot decision-making increasingly relies on expressive data-driven human prediction models when operating around people. While these models are known to suffer from …