Representation engineering: A top-down approach to AI transparency. CoRR, abs/2310.01405,... - Academic Resource Search

Representation engineering: A top-down approach to AI transparency

A Zou, L Phan, S Chen, J Campbell, P Guo… - arXiv preprint arXiv …, 2023 - arxiv.org

In this paper, we identify and characterize the emerging area of representation engineering
(RepE), an approach to enhancing the transparency of AI systems that draws on insights
from cognitive neuroscience. RepE places population-level representations, rather than
neurons or circuits, at the center of analysis, equipping us with novel methods for monitoring
and manipulating high-level cognitive phenomena in deep neural networks (DNNs). We
provide baselines and an initial analysis of RepE techniques, showing that they offer simple …

Cited by 280 · Related articles · All 2 versions

[CITATION][C] Representation engineering: A top-down approach to AI transparency. CoRR, abs/2310.01405, 2023. doi: 10.48550

A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren… - arXiv preprint ARXIV.2310.01405

Cited by 20 · Related articles
