A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

A state-of-the-art review on machine learning-based multiscale modeling, simulation, homogenization and design of materials

D Bishara, Y Xie, WK Liu, S Li - Archives of computational methods in …, 2023 - Springer
Multiscale simulation and homogenization of materials have become the major
computational technology as well as engineering tools in material modeling and material …

R-Drop: Regularized dropout for neural networks

L Wu, J Li, Y Wang, Q Meng, T Qin… - Advances in …, 2021 - proceedings.neurips.cc
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …

Temporal fusion transformers for interpretable multi-horizon time series forecasting

B Lim, SÖ Arık, N Loeff, T Pfister - International Journal of Forecasting, 2021 - Elsevier
Multi-horizon forecasting often contains a complex mix of inputs, including static (i.e. time-invariant)
covariates, known future inputs, and other exogenous time series that are only …

An introduction to variational autoencoders

DP Kingma, M Welling - Foundations and Trends® in …, 2019 - nowpublishers.com

Deep equilibrium models

S Bai, JZ Kolter, V Koltun - Advances in neural information …, 2019 - proceedings.neurips.cc
We present a new approach to modeling sequential data: the deep equilibrium model
(DEQ). Motivated by an observation that the hidden layers of many existing deep sequence …

RAT-SQL: Relation-aware schema encoding and linking for text-to-SQL parsers

B Wang, R Shin, X Liu, O Polozov… - arXiv preprint arXiv …, 2019 - arxiv.org
When translating natural language questions into SQL queries to answer questions from a
database, contemporary semantic parsing models struggle to generalize to unseen …

DropBlock: A regularization method for convolutional networks

G Ghiasi, TY Lin, QV Le - Advances in neural information …, 2018 - proceedings.neurips.cc
Deep neural networks often work well when they are over-parameterized and trained with a
massive amount of noise and regularization, such as weight decay and dropout. Although …

From recognition to cognition: Visual commonsense reasoning

R Zellers, Y Bisk, A Farhadi… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Visual understanding goes well beyond object recognition. With one glance at an image, we
can effortlessly imagine the world beyond the pixels: for instance, we can infer people's …

FreeLB: Enhanced adversarial training for natural language understanding

C Zhu, Y Cheng, Z Gan, S Sun, T Goldstein… - arXiv preprint arXiv …, 2019 - arxiv.org
Adversarial training, which minimizes the maximal risk for label-preserving input
perturbations, has proved to be effective for improving the generalization of language …