Infinitely deep Bayesian neural networks with stochastic differential equations

W Xu, RTQ Chen, X Li… - … Conference on Artificial …, 2022 - proceedings.mlr.press
We perform scalable approximate inference in continuous-depth Bayesian neural networks.
In this model class, uncertainty about separate weights in each layer gives hidden units that …

Rank diminishing in deep neural networks

R Feng, K Zheng, Y Huang, D Zhao… - Advances in Neural …, 2022 - proceedings.neurips.cc
The rank of neural networks measures information flowing across layers. It is an instance of
a key structural condition that applies across broad domains of machine learning. In …

Neural tangent kernel analysis of deep narrow neural networks

J Lee, JY Choi, EK Ryu, A No - International Conference on …, 2022 - proceedings.mlr.press
The tremendous recent progress in analyzing the training dynamics of overparameterized
neural networks has primarily focused on wide networks and therefore does not sufficiently …

Clustering in pure-attention hardmax transformers and its role in sentiment analysis

A Alcalde, G Fantuzzi, E Zuazua - arXiv preprint arXiv:2407.01602, 2024 - arxiv.org
Transformers are extremely successful machine learning models whose mathematical
properties remain poorly understood. Here, we rigorously characterize the behavior of …

Neural Tangent Kernel Analysis of Deep Narrow Neural Networks

이종민 - 2023 - s-space.snu.ac.kr
The tremendous recent progress in analyzing the training dynamics of overparameterized
neural networks has primarily focused on wide networks and therefore does not sufficiently …

[BOOK][B] Methods for Bayesian Inference and Data Assimilation of Soil Biogeochemical Models

HW Xie - 2022 - search.proquest.com
Improving mechanistic understanding and prediction capabilities of long-term organic soil
system dynamics is a high priority for biogeochemists, soil scientists, and climate policy …

Gradient Explosion and Representation Shrinkage in Infinite Networks

A Klukowski - openreview.net
We study deep fully-connected neural networks using the mean field formalism, and carry
out a non-perturbative analysis of signal propagation. As a result, we demonstrate that …