Occlusion aware facial expression recognition using CNN with attention mechanism

Y Li, J Zeng, S Shan, X Chen - IEEE Transactions on Image …, 2018 - ieeexplore.ieee.org
Facial expression recognition in the wild is challenging due to various unconstrained
conditions. Although existing facial expression classifiers have been almost perfect on …

CLEVRER: Collision events for video representation and reasoning

K Yi, C Gan, Y Li, P Kohli, J Wu, A Torralba… - arXiv preprint arXiv …, 2019 - arxiv.org
The ability to reason about temporal and causal events from videos lies at the core of human
intelligence. Most video reasoning benchmarks, however, focus on pattern recognition from …

Neural-symbolic VQA: Disentangling reasoning from vision and language understanding

K Yi, J Wu, C Gan, A Torralba, P Kohli… - Advances in neural …, 2018 - proceedings.neurips.cc
We marry two powerful ideas: deep representation learning for visual recognition and
language understanding, and symbolic program execution for reasoning. Our neural …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

CAMP: Cross-modal adaptive message passing for text-image retrieval

Z Wang, X Liu, H Li, L Sheng, J Yan… - Proceedings of the …, 2019 - openaccess.thecvf.com
Text-image cross-modal retrieval is a challenging task in the field of language and vision.
Most previous approaches independently embed images and sentences into a joint …

Relation-aware graph attention network for visual question answering

L Li, Z Gan, Y Cheng, J Liu - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
In order to answer semantically complicated questions about an image, a Visual Question
Answering (VQA) model needs to fully understand the visual scene in the image, especially …

A survey of vehicle re-identification based on deep learning

H Wang, J Hou, N Chen - IEEE Access, 2019 - ieeexplore.ieee.org
Vehicle re-identification is one of the core technologies of intelligent transportation systems,
and it is crucial for the construction of smart cities. With the rapid development of deep …

Dynamic fusion with intra- and inter-modality attention flow for visual question answering

P Gao, Z Jiang, H You, P Lu… - Proceedings of the …, 2019 - openaccess.thecvf.com
Learning effective fusion of multi-modality features is at the heart of visual question
answering. We propose a novel method of dynamically fusing multi-modal features with intra …

Viewpoint-aware attentive multi-view inference for vehicle re-identification

Y Zhou, L Shao - Proceedings of the IEEE conference on …, 2018 - openaccess.thecvf.com
Vehicle re-identification (re-ID) has huge potential to contribute to intelligent video
surveillance. However, it suffers from challenges that different vehicle identities with a similar …

RAVEN: A dataset for relational and analogical visual reasoning

C Zhang, F Gao, B Jia, Y Zhu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Dramatic progress has been witnessed in basic vision tasks involving low-level perception,
such as object recognition, detection, and tracking. Unfortunately, there is still enormous …