D Neimark, O Bar, M Zohar… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper presents VTN, a transformer-based framework for video recognition. Inspired by
recent developments in vision transformers, we ditch the standard approach in video action …