Transformers learn to implement preconditioned gradient descent for in-context learning

K Ahn, X Cheng, H Daneshmand… - Advances in Neural …, 2023 - proceedings.neurips.cc
Several recent works demonstrate that transformers can implement algorithms like gradient
descent. By a careful construction of weights, these works show that multiple layers of …

Going in circles is the way forward: the role of recurrence in visual inference

RS van Bergen, N Kriegeskorte - Current Opinion in Neurobiology, 2020 - Elsevier
Highlights: Neural network models of vision are dominated by feedforward architectures. Biological vision, by contrast, exhibits abundant recurrent processing. The computational …

Representation engineering: A top-down approach to AI transparency

A Zou, L Phan, S Chen, J Campbell, P Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we identify and characterize the emerging area of representation engineering
(RepE), an approach to enhancing the transparency of AI systems that draws on insights …

Eliciting latent predictions from transformers with the tuned lens

N Belrose, Z Furman, L Smith, D Halawi… - arXiv preprint arXiv …, 2023 - arxiv.org
We analyze transformers from the perspective of iterative inference, seeking to understand
how model predictions are refined layer by layer. To do so, we train an affine probe for each …

A disciplined approach to neural network hyper-parameters: Part 1 — learning rate, batch size, momentum, and weight decay

LN Smith - arXiv preprint arXiv:1803.09820, 2018 - arxiv.org
Although deep learning has produced dazzling successes for applications of image, speech,
and video processing in the past few years, most trainings are with suboptimal hyper …

CodeSLAM: Learning a compact, optimisable representation for dense visual SLAM

M Bloesch, J Czarnowski, R Clark… - Proceedings of the …, 2018 - openaccess.thecvf.com
The representation of geometry in real-time 3D perception systems continues to be a critical
research issue. Dense maps capture complete surface shape and can be augmented with …

Architecture matters in continual learning

SI Mirzadeh, A Chaudhry, D Yin, T Nguyen… - arXiv preprint arXiv …, 2022 - arxiv.org
A large body of research in continual learning is devoted to overcoming the catastrophic
forgetting of neural networks by designing new algorithms that are robust to the distribution …

Toward fast and accurate human pose estimation via soft-gated skip connections

A Bulat, J Kossaifi, G Tzimiropoulos… - 2020 15th IEEE …, 2020 - ieeexplore.ieee.org
This paper is on highly accurate and highly efficient human pose estimation. Recent works
based on Fully Convolutional Networks (FCNs) have demonstrated excellent results for this …

Multi-level residual networks from dynamical systems view

B Chang, L Meng, E Haber, F Tung… - arXiv preprint arXiv …, 2017 - arxiv.org
Deep residual networks (ResNets) and their variants are widely used in many computer
vision applications and natural language processing tasks. However, the theoretical …

Brain-like object recognition with high-performing shallow recurrent ANNs

J Kubilius, M Schrimpf, K Kar… - Advances in neural …, 2019 - proceedings.neurips.cc
Deep convolutional artificial neural networks (ANNs) are the leading class of candidate
models of the mechanisms of visual processing in the primate ventral stream. While initially …