Compressing DMA engine: Leveraging activation sparsity for training deep neural networks M Rhu, M O'Connor, N Chatterjee, J Pool, Y Kwon, SW Keckler 2018 IEEE International Symposium on High Performance Computer Architecture …, 2018 | 227 | 2018 |
TensorDIMM: A practical near-memory processing architecture for embeddings and tensor operations in deep learning Y Kwon, Y Lee, M Rhu Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 218 | 2019 |
Centaur: A chiplet-based, hybrid sparse-dense accelerator for personalized recommendations R Hwang, T Kim, Y Kwon, M Rhu 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 105 | 2020 |
Beyond the memory wall: A case for memory-centric HPC system for deep learning Y Kwon, M Rhu 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018 | 84 | 2018 |
Tensor casting: Co-designing algorithm-architecture for personalized recommendation training Y Kwon, Y Lee, M Rhu 2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021 | 42 | 2021 |
NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units B Hyun, Y Kwon, Y Choi, J Kim, M Rhu Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020 | 33 | 2020 |
A disaggregated memory system for deep learning Y Kwon, M Rhu IEEE Micro 39 (5), 82-90, 2019 | 30 | 2019 |
Training personalized recommendation systems from (GPU) scratch: look forward not backwards Y Kwon, M Rhu Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 24 | 2022 |
A case for memory-centric HPC system architecture for training deep neural networks Y Kwon, M Rhu IEEE Computer Architecture Letters 17 (2), 134-138, 2018 | 17 | 2018 |
Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training Y Lee, Y Kwon, M Rhu IEEE Computer Architecture Letters 20 (2), 118-121, 2021 | 9 | 2021 |
High performance computing system for deep learning M Rhu, Y Kwon US Patent App. 16/595,992, 2020 | 4 | 2020 |
Neural network acceleration system and operating method thereof M Rhu, Y Kwon, Y Lee US Patent App. 16/922,333, 2021 | 1 | 2021 |
LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models J Lim, Y Kwon, R Hwang, K Maeng, E Suh, M Rhu Proceedings of the 29th ACM International Conference on Architectural …, 2024 | | 2024 |