Deep 360 pilot: Learning a deep agent for piloting through 360° sports videos

From image to language: A critical analysis of visual question answering (vqa) approaches, challenges, and opportunities

MF Ishmam, MSH Shovon, MF Mridha, N Dey - Information Fusion, 2024 - Elsevier

The multimodal task of Visual Question Answering (VQA) encompassing elements of
Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers …

Cited by 9 · Related articles · All 2 versions

Learning spherical convolution for fast features from 360 imagery

YC Su, K Grauman - Advances in neural information …, 2017 - proceedings.neurips.cc

While 360 cameras offer tremendous new possibilities in vision, graphics, and augmented
reality, the spherical images they produce make core feature extraction non-trivial …

Cited by 320 · Related articles · All 7 versions

Saltinet: Scan-path prediction on 360 degree images using saliency volumes

M Assens Reina, X Giro-i-Nieto… - Proceedings of the …, 2017 - openaccess.thecvf.com

We introduce SaltiNet, a deep neural network for scanpath prediction trained on 360-degree
images. The model is based on a temporal-aware novel representation of saliency …

Cited by 136 · Related articles · All 14 versions

A spherical convolution approach for learning long term viewport prediction in 360 immersive video

C Wu, R Zhang, Z Wang, L Sun - … of the AAAI Conference on Artificial …, 2020 - ojs.aaai.org

Viewport prediction for 360 video forecasts a viewer's viewport when he/she watches a 360
video with a head-mounted display, which benefits many VR/AR applications such as 360 …

Cited by 32 · Related articles · All 5 versions

DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction

J Xiong, P Zhang, T You, C Li… - Proceedings of the …, 2024 - openaccess.thecvf.com

Audio-visual saliency prediction can draw support from diverse modality complements but
further performance enhancement is still challenged by customized architectures as well as …

Cited by 1 · Related articles · All 3 versions

Self-view grounding given a narrated 360 video

SH Chou, YC Chen, KH Zeng, HN Hu, J Fu… - Proceedings of the AAAI …, 2018 - ojs.aaai.org

Narrated 360 videos are typically provided in many touring scenarios to mimic real-world
experience. However, previous work has shown that smart assistance (i.e., providing visual …

Cited by 24 · Related articles · All 7 versions

Descriptor matching for a discrete spherical image with a convolutional neural network

Y Shan, S Li - IEEE Access, 2018 - ieeexplore.ieee.org

In this paper, we propose a method of extracting feature descriptors from discrete spherical
images using convolutional neural networks (CNNs). First, a captured full-view image is …

Cited by 21 · Related articles · All 4 versions

Scanpath and saliency prediction on 360 degree images

M Assens, X Giro-i-Nieto, K McGuinness… - Signal Processing …, 2018 - Elsevier

We introduce deep neural networks for scanpath and saliency prediction trained on
360-degree images. The scanpath prediction model called SaltiNet is based on a temporal …

Cited by 21 · Related articles · All 6 versions

An Integrated System for Spatio-temporal Summarization of 360-Degrees Videos

I Kontostathis, E Apostolidis, V Mezaris - International Conference on …, 2024 - Springer

In this work, we present an integrated system for spatio-temporal summarization of
360-degrees videos. The video summary production involves the detection of salient events in …

Cited by 1 · Related articles · All 4 versions

Predicting 360° Video Saliency: A ConvLSTM Encoder-Decoder Network with Spatio-temporal Consistency

Z Wan, H Qin, R Xiong, Z Li, X Fan… - IEEE Journal on …, 2024 - ieeexplore.ieee.org

360° videos have been widely used with the development of virtual reality technology and
triggered a demand to determine the most visually attractive objects in them, aka 360° video …
