F Mahdisoltani, G Berger, W Gharbieh, D Fleet… - arXiv e …, 2018 - ui.adsabs.harvard.edu
We describe a DNN for video classification and captioning, trained end-to-end, with shared
features, to solve tasks at different levels of granularity, exploring the link between …