ShenTu: processing multi-trillion edge graphs on millions of cores in seconds H Lin, X Zhu, B Yu, X Tang, W Xue, W Chen, L Zhang, T Hoefler, X Ma, ... Proceedings of the International Conference for High Performance Computing …, 2018 | 63 | 2018 |
Cost-effective cloud HPC resource provisioning by building semi-elastic virtual clusters S Niu, J Zhai, X Ma, X Tang, W Chen Proceedings of the International Conference on High Performance Computing …, 2013 | 59 | 2013 |
Scalable graph traversal on Sunway TaihuLight with ten million cores H Lin, X Tang, B Yu, Y Zhuo, W Chen, J Zhai, W Yin, W Zheng 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 44 | 2017 |
Cypress: Combining static and dynamic analysis for top-down communication trace compression J Zhai, J Hu, X Tang, X Ma, W Chen SC'14: Proceedings of the International Conference for High Performance …, 2014 | 42 | 2014 |
Building semi-elastic virtual clusters for cost-effective HPC cloud resource provisioning S Niu, J Zhai, X Ma, X Tang, W Chen, W Zheng IEEE Transactions on Parallel and Distributed Systems 27 (7), 1915-1928, 2015 | 32 | 2015 |
Spindle: Informed memory access monitoring H Wang, J Zhai, X Tang, B Yu, X Ma, W Chen 2018 USENIX Annual Technical Conference (USENIX ATC 18), 561-574, 2018 | 21 | 2018 |
pLock: A fast lock for architectures with explicit inter-core message passing X Tang, J Zhai, X Qian, W Chen Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019 | 15 | 2019 |
Self-Checkpoint: An In-Memory Checkpoint Method Using Less Space and Its Practice on Fault-Tolerant HPL X Tang, J Zhai, B Yu, W Chen, W Zheng Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017 | 15 | 2017 |
Spread-n-share: improving application performance and cluster throughput with resource-aware job placement X Tang, H Wang, X Ma, N El-Sayed, J Zhai, W Chen, A Aboulnaga Proceedings of the International Conference for High Performance Computing …, 2019 | 14 | 2019 |
An Efficient In-Memory Checkpoint Method and its Practice on Fault-Tolerant HPL X Tang, J Zhai, B Yu, W Chen, W Zheng, K Li IEEE Transactions on Parallel and Distributed Systems 29 (4), 758-771, 2018 | 13 | 2018 |
ScalAna: Automating scaling loss detection with graph analysis Y Jin, H Wang, T Yu, X Tang, T Hoefler, X Liu, J Zhai SC20: International Conference for High Performance Computing, Networking …, 2020 | 10 | 2020 |
Processing multi-trillion edge graphs on millions of cores in seconds H Lin, X Zhu, B Yu, X Tang, W Xue, W Chen, L Zhang, T Hoefler, X Ma, ... ACM/IEEE International Conference for High Performance Computing, Networking …, 2018 | 9 | 2018 |
vSensor: leveraging fixed-workload snippets of programs for performance variance detection X Tang, J Zhai, X Qian, B He, W Xue, W Chen Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of …, 2018 | 7 | 2018 |
Vapro: Performance variance detection and diagnosis for production-run parallel applications L Zheng, J Zhai, X Tang, H Wang, T Yu, Y Jin, SL Song, W Chen Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of …, 2022 | 5 | 2022 |
Identifying scalability bottlenecks for large-scale parallel programs with graph analysis Y Jin, H Wang, X Tang, T Hoefler, X Liu, J Zhai Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of …, 2020 | 2 | 2020 |
Detecting performance variance for parallel applications without source code J Zhai, L Zheng, F Zhang, X Tang, H Wang, T Yu, Y Jin, SL Song, W Chen IEEE Transactions on Parallel and Distributed Systems 33 (12), 4239-4255, 2022 | 1 | 2022 |
Leveraging code snippets to detect variations in the performance of HPC systems J Zhai, L Zheng, J Sun, F Zhang, X Tang, X Qian, B He, W Xue, W Chen, ... IEEE Transactions on Parallel and Distributed Systems 33 (12), 3558-3574, 2022 | 1 | 2022 |
Sparker: Efficient Reduction for More Scalable Machine Learning with Spark B Yu, H Cao, T Shan, H Wang, X Tang, W Chen Proceedings of the 50th International Conference on Parallel Processing, 1-11, 2021 | 1 | 2021 |
A Fast Lock for Explicit Message Passing Architectures X Tang, C Zhang, J Zhai, X Qian, W Chen, Y Jiang IEEE Transactions on Computers 70 (10), 1555-1568, 2020 | | 2020 |