A survey on neural network interpretability

Y Zhang, P Tiňo, A Leonardis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Along with the great success of deep neural networks, there is also growing concern about
their black-box nature. The interpretability issue affects people's trust in deep learning …

Analysis methods in neural language processing: A survey

Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …

Explainability for large language models: A survey

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

Transformer feed-forward layers are key-value memories

M Geva, R Schuster, J Berant, O Levy - arXiv preprint arXiv:2012.14913, 2020 - arxiv.org
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …
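
The key-value-memory reading that the title refers to can be sketched in a few lines of Python. Everything below (sizes, random weights, the top-k inspection) is an illustrative assumption rather than the paper's own code: each column of the first feed-forward matrix acts as a key matched against the token representation, the corresponding row of the second matrix is the value it retrieves, and the block's output is a coefficient-weighted sum of that fixed set of value vectors.

import numpy as np

# Sizes and weights are made up for illustration; in a real model they are
# learned, with d_ff typically several times d_model.
d_model, d_ff = 8, 32
rng = np.random.default_rng(0)
W_in = rng.normal(size=(d_model, d_ff))    # each column is a "key"
W_out = rng.normal(size=(d_ff, d_model))   # each row is the matching "value"

def ffn(x):
    # Memory coefficients: how strongly each key matches the input token.
    m = np.maximum(x @ W_in, 0.0)          # ReLU(x . k_i) for every key k_i
    # Output: coefficient-weighted sum of the value vectors.
    return m @ W_out                       # sum_i m_i * v_i

x = rng.normal(size=d_model)               # one token representation
y = ffn(x)

# Which "memory cells" fire most strongly for this input?
coeffs = np.maximum(x @ W_in, 0.0)
top = np.argsort(coeffs)[::-1][:5]
print("output:", y.round(2))
print("top activated memory cells:", top, coeffs[top].round(2))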

How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model

M Hanna, O Liu, A Variengien - Advances in Neural …, 2024 - proceedings.neurips.cc
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …

Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in AI

A Jacovi, A Marasović, T Miller… - Proceedings of the 2021 …, 2021 - dl.acm.org
Trust is a central component of the interaction between people and AI, in that 'incorrect' levels
of trust may cause misuse, abuse or disuse of the technology. But what, precisely, is the …

Finding neurons in a haystack: Case studies with sparse probing

W Gurnee, N Nanda, M Pauly, K Harvey… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite rapid adoption and deployment of large language models (LLMs), the internal
computations of these models remain opaque and poorly understood. In this work, we seek …
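
A sparse probe of the kind the title mentions can be illustrated roughly as follows. The data, the planted signal, and the use of L1-regularized logistic regression are assumptions made for the sketch, not the paper's procedure; the point is only that a probe forced to be sparse nominates a small set of candidate neurons.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: activations of 512 neurons for 2000 tokens, plus a binary
# label per token for some feature of interest. A weak signal is planted in
# neuron 37 so the probe has something to find.
rng = np.random.default_rng(0)
acts = rng.normal(size=(2000, 512))
labels = (acts[:, 37] + 0.3 * rng.normal(size=2000) > 0).astype(int)

# An L1-penalized linear probe drives most weights to exactly zero, so the
# surviving coefficients point at a handful of candidate neurons.
probe = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
probe.fit(acts, labels)

selected = np.flatnonzero(probe.coef_[0])
print("neurons with non-zero probe weight:", selected)
print("training accuracy of the sparse probe:", round(probe.score(acts, labels), 3))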

Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

Natural language descriptions of deep visual features

E Hernandez, S Schwettmann, D Bau… - International …, 2021 - openreview.net
Some neurons in deep networks specialize in recognizing highly specific perceptual,
structural, or semantic features of inputs. In computer vision, techniques exist for identifying …

Head2toe: Utilizing intermediate representations for better transfer learning

U Evci, V Dumoulin, H Larochelle… - … on Machine Learning, 2022 - proceedings.mlr.press
Transfer-learning methods aim to improve performance in a data-scarce target domain using
a model pretrained on a data-rich source domain. A cost-efficient strategy, linear probing …
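
The contrast the snippet sets up, between probing only the final layer and also drawing on intermediate representations, can be sketched as follows. The backbone, data, and classifier are placeholder assumptions, and any feature-selection step the method itself applies is omitted; the sketch only shows frozen features from several layers being concatenated before a single linear head is fit.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder "frozen backbone": pretend activations from three layers of a
# pretrained network were already extracted for 500 target-domain examples.
rng = np.random.default_rng(0)
layer_feats = [rng.normal(size=(500, d)) for d in (256, 512, 1024)]
labels = rng.integers(0, 10, size=500)

# Plain linear probing: a linear classifier on the final-layer features only.
last_probe = LogisticRegression(max_iter=1000).fit(layer_feats[-1], labels)

# Using intermediate representations as well: expose every layer to the linear
# head by concatenating the frozen features before fitting.
all_feats = np.concatenate(layer_feats, axis=1)
all_probe = LogisticRegression(max_iter=1000).fit(all_feats, labels)

# On synthetic data the scores mean nothing; the point is the shape of the
# pipeline: frozen features in, one linear head out.
print("final-layer feature dim:", layer_feats[-1].shape[1])
print("concatenated feature dim:", all_feats.shape[1])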