Adaptive Estimators Show Information Compression in Deep Neural Networks I Chelombiev, C Houghton, C O'Donnell International Conference on Learning Representations (ICLR), 2019 | 48 | 2019 |
Towards Structured Dynamic Sparse Pre-Training of BERT A Dietrich, F Gressmann, D Orr, I Chelombiev, D Justus, C Luschi arXiv preprint arXiv:2108.06277, 2021 | 9 | 2021 |
SparQ Attention: Bandwidth-Efficient LLM Inference L Ribar, I Chelombiev, L Hudlass-Galley, C Blake, C Luschi, D Orr arXiv preprint arXiv:2312.04985, 2023 | 6 | 2023 |
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures I Chelombiev, D Justus, D Orr, A Dietrich, F Gressmann, A Koliousis, ... arXiv preprint arXiv:2106.05822, 2021 | 5 | 2021 |
Adaptive Estimators Show Information Compression in Deep Neural Networks I Chelombiev, C Houghton, C O’Donnell arXiv preprint arXiv:1902.09037, 2019 | 5 | |
Dynamic Sparse Pre-Training of BERT ASD Dietrich, F Gressmann, D Orr, I Chelombiev, D Justus, C Luschi | | |