Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Shifting machine learning for healthcare from development to deployment and from models to data

A Zhang, L Xing, J Zou, JC Wu - Nature Biomedical Engineering, 2022 - nature.com
In the past decade, the application of machine learning (ML) to healthcare has helped drive
the automation of physician tasks as well as enhancements in clinical capabilities and …

Attention Is All You Need.(Nips), 2017

A Vaswani, N Shazeer, N Parmar, J Uszkoreit… - arXiv preprint arXiv …, 2017 - codetds.com
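The dominant sequence transduction models are based on complex recurrent or convolutional
neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture …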

Motiondiffuse: Text-driven human motion generation with diffusion model

M Zhang, Z Cai, L Pan, F Hong, X Guo, L Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
Human motion modeling is important for many modern graphics applications, which typically
require professional skills. In order to remove the skill barriers for laymen, recent motion …

Non-stationary transformers: Exploring the stationarity in time series forecasting

Y Liu, H Wu, J Wang, M Long - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers have shown great power in time series forecasting due to their global-range
modeling ability. However, their performance can degenerate terribly on non-stationary real …

Resurrecting recurrent neural networks for long sequences

A Orvieto, SL Smith, A Gu, A Fernando… - International …, 2023 - proceedings.mlr.press
Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are
hard to optimize and slow to train. Deep state-space models (SSMs) have recently been …

Seqtrack: Sequence to sequence learning for visual object tracking

X Chen, H Peng, D Wang, H Lu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we present a new sequence-to-sequence learning framework for visual
tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem …

Recommendation as language processing (rlp): A unified pretrain, personalized prompt & predict paradigm (p5)

S Geng, S Liu, Z Fu, Y Ge, Y Zhang - … of the 16th ACM Conference on …, 2022 - dl.acm.org
For a long time, different recommendation tasks require designing task-specific architectures
and training objectives. As a result, it is hard to transfer the knowledge and representations …

Query-centric trajectory prediction

Z Zhou, J Wang, YH Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Predicting the future trajectories of surrounding agents is essential for autonomous vehicles
to operate safely. This paper presents QCNet, a modeling framework toward pushing the …

SuperFusion: A versatile image registration and fusion network with semantic awareness

L Tang, Y Deng, Y Ma, J Huang… - IEEE/CAA Journal of …, 2022 - ieeexplore.ieee.org
Image fusion aims to integrate complementary information in source images to synthesize a
fused image comprehensively characterizing the imaging scene. However, existing image …