Multimodal intelligence: Representation learning, information fusion, and applications

C Zhang, Z Yang, X He, L Deng - IEEE Journal of Selected …, 2020 - ieeexplore.ieee.org
Deep learning methods have revolutionized speech recognition, image recognition, and
natural language processing since 2010. Each of these tasks involves a single modality in …

Predicting industrial building energy consumption with statistical and machine-learning models informed by physical system parameters

S Kapp, JK Choi, T Hong - Renewable and Sustainable Energy Reviews, 2023 - Elsevier
The industrial sector consumes about one-third of global energy, making it a frequent
target for energy use reduction. Variation in energy usage is observed with weather …

Fine-grained image analysis with deep learning: A survey

XS Wei, YZ Song, O Mac Aodha, J Wu… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer
vision and pattern recognition, and underpins a diverse set of real-world applications. The …

Mixed high-order attention network for person re-identification

B Chen, W Deng, J Hu - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
Attention has become more attractive in person re-identification (ReID) as it is capable of
biasing the allocation of available resources towards the most informative parts of an input …

Randomized numerical linear algebra: Foundations and algorithms

PG Martinsson, JA Tropp - Acta Numerica, 2020 - cambridge.org
This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

Part-aligned bilinear representations for person re-identification

Y Suh, J Wang, S Tang, T Mei… - Proceedings of the …, 2018 - openaccess.thecvf.com
Comparing the appearance of corresponding body parts is essential for person re-
identification. As body parts are frequently misaligned between the detected human boxes …

Multimodal compact bilinear pooling for visual question answering and visual grounding

A Fukui, DH Park, D Yang, A Rohrbach… - arXiv preprint arXiv …, 2016 - arxiv.org
Modeling textual or visual information with vector representations trained from large
language or visual datasets has been successfully explored in recent years. However, tasks …

Hadamard product for low-rank bilinear pooling

JH Kim, KW On, W Lim, J Kim, JW Ha… - arXiv preprint arXiv …, 2016 - arxiv.org
Bilinear models provide rich representations compared with linear models. They have been
applied in various visual tasks, such as object recognition, segmentation, and visual …

Hierarchical bilinear pooling for fine-grained visual recognition

C Yu, X Zhao, Q Zheng, P Zhang… - Proceedings of the …, 2018 - openaccess.thecvf.com
Fine-grained visual recognition is challenging because it highly relies on the modeling of
various semantic parts and fine-grained feature learning. Bilinear pooling based models …