Vladimir Mikulik - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	4329	4326
h-index	16	16
i10-index	17	17

Citations per year: 2020: 54; 2021: 436; 2022: 689; 2023: 981; 2024: 2138

Public access

2 articles available
0 articles not available

Based on funding mandates

Vladimir Mikulik

Anthropic

Verified email at anthropic.com

AI alignment, AGI


Title	Cited by	Year
Gemini: a family of highly capable multimodal models. G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023	1368	2023
Inferring the effectiveness of government interventions against COVID-19. JM Brauner, S Mindermann, M Sharma, D Johnston, J Salvatier, ... Science 371 (6531), eabd9338, 2021	1021	2021
Scaling language models: Methods, analysis & insights from training gopher. JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021	924	2021
Teaching language models to support answers with verified quotes. J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, ... arXiv preprint arXiv:2203.11147, 2022	175	2022
Alignment of language agents. Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving. arXiv preprint arXiv:2103.14659, 2021	142	2021
Risks from learned optimization in advanced machine learning systems. E Hubinger, C van Merwijk, V Mikulik, J Skalse, S Garrabrant. arXiv preprint arXiv:1906.01820, 2019	127	2019
The DeepMind JAX Ecosystem, 2020. I Babuschkin, K Baumli, A Bell, S Bhupatiraju, J Bruce, P Buchlovsky, ... URL http://github.com/deepmind 18, 2010	99	2010
Specification gaming: the flip side of AI ingenuity. V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ... DeepMind Blog 3, 2020	96	2020
The effectiveness and perceived burden of nonpharmaceutical interventions against COVID-19 transmission: a modelling study with 41 countries. JM Brauner, S Mindermann, M Sharma, AB Stephenson, T Gavenčiak, ... medRxiv, 2020.05.28.20116129, 2020	84	2020
The DeepMind JAX Ecosystem. I Babuschkin, K Baumli, A Bell, S Bhupatiraju, J Bruce, P Buchlovsky, ... URL http://github.com/deepmind 24, 25, 2020	62	2020
Tracr: Compiled transformers as a laboratory for interpretability. D Lindner, J Kramár, S Farquhar, M Rahtz, T McGrath, V Mikulik. Advances in Neural Information Processing Systems 36, 2024	45	2024
Does circuit analysis interpretability scale? evidence from multiple choice capabilities in chinchilla. T Lieberum, M Rahtz, J Kramár, N Nanda, G Irving, R Shah, V Mikulik. arXiv preprint arXiv:2307.09458, 2023	43	2023
Meta-trained agents implement bayes-optimal agents. V Mikulik, G Delétang, T McGrath, T Genewein, M Martic, S Legg, ... Advances in Neural Information Processing Systems 33, 18691-18703, 2020	41	2020
The hydra effect: Emergent self-repair in language model computations. T McGrath, M Rahtz, J Kramar, V Mikulik, S Legg. arXiv preprint arXiv:2307.15771, 2023	35	2023
Neural networks are a priori biased towards boolean functions with low entropy. C Mingard, J Skalse, G Valle-Pérez, D Martínez-Rubio, V Mikulik, ... arXiv preprint arXiv:1909.11522, 2019	28	2019
Algorithms for causal reasoning in probability trees. T Genewein, T McGrath, G Delétang, V Mikulik, M Martic, S Legg, ... arXiv preprint arXiv:2010.12237, 2020	20	2020
Causal analysis of agent behavior for ai safety. G Delétang, J Grau-Moya, M Martic, T Genewein, T McGrath, V Mikulik, ... arXiv preprint arXiv:2103.03938, 2021	10	2021
Challenges with unsupervised LLM knowledge discovery. S Farquhar, V Varma, Z Kenton, J Gasteiger, V Mikulik, R Shah. arXiv preprint arXiv:2312.10029, 2023	9	2023

Articles 1–18