Parallel programming with migratable objects: Charm++ in practice B Acun, A Gupta, N Jain, A Langer, H Menon, E Mikida, X Ni, M Robson, ... SC'14: Proceedings of the International Conference for High Performance …, 2014 | 238 | 2014 |
A scalable double in-memory checkpoint and restart scheme towards exascale G Zheng, X Ni, LV Kalé IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012 | 163 | 2012 |
ACR: Automatic checkpoint/restart for soft and hard error protection X Ni, E Meneses, N Jain, LV Kalé Proceedings of the international conference on high performance computing …, 2013 | 112 | 2013 |
Maximizing throughput on a dragonfly network N Jain, A Bhatele, X Ni, NJ Wright, LV Kale SC'14: Proceedings of the International Conference for High Performance …, 2014 | 97 | 2014 |
Hiding checkpoint overhead in HPC applications with a semi-blocking algorithm X Ni, E Meneses, LV Kalé 2012 IEEE International Conference on Cluster Computing, 364-372, 2012 | 57 | 2012 |
Using migratable objects to enhance fault tolerance schemes in supercomputers E Meneses, X Ni, G Zheng, CL Mendes, LV Kale IEEE transactions on parallel and distributed systems 26 (7), 2061-2074, 2014 | 49 | 2014 |
Migratable objects + active messages + adaptive runtime = productivity + performance: a submission to 2012 HPC class II challenge L Kale, A Arya, N Jain, A Langer, J Lifflander, H Menon, X Ni, Y Sun, ... Parallel Programming Laboratory, Tech. Rep, 12-47, 2012 | 33 | 2012 |
Partitioning low-diameter networks to eliminate inter-job interference N Jain, A Bhatele, X Ni, T Gamblin, LV Kale 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 27 | 2017 |
Generalizable resource allocation in stream processing via deep reinforcement learning X Ni, J Li, M Yu, W Zhou, KL Wu Proceedings of the AAAI Conference on Artificial Intelligence 34 (01), 857-864, 2020 | 26 | 2020 |
A message-logging protocol for multicore systems E Meneses, X Ni, LV Kalé IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012 | 25 | 2012 |
Analyzing the interplay of failures and workload on a leadership-class supercomputer E Meneses, X Ni, T Jones, D Maxwell computing 2 (3), 4, 2015 | 23 | 2015 |
FlipBack: automatic targeted protection against silent data corruption X Ni, LV Kale SC'16: Proceedings of the International Conference for High Performance …, 2016 | 19 | 2016 |
Lossy compression for checkpointing: Fallible or feasible? X Ni, T Islam, K Mohror, A Moody, LV Kale Proceedings of the International Conference for High Performance Computing …, 2014 | 18 | 2014 |
Scalable asynchronous contact mechanics using Charm++ X Ni, LV Kale, R Tamstorf 2015 IEEE International Parallel and Distributed Processing Symposium, 677-686, 2015 | 14 | 2015 |
The Charm++ parallel programming system L Kalé, B Acun, S Bak, A Becker, M Bhandarkar, N Bhat, A Bhatele, ... Aug, 2019 | 12 | 2019 |
A memory heterogeneity-aware runtime system for bandwidth-sensitive HPC applications K Chandrasekar, X Ni, LV Kale 2017 IEEE International Parallel and Distributed Processing Symposium …, 2017 | 10 | 2017 |
Mitigation of failures in high performance computing via runtime techniques X Ni University of Illinois at Urbana-Champaign, 2016 | 8 | 2016 |
Design and analysis of a message logging protocol for fault tolerant multicore systems E Meneses, X Ni, LV Kalé Parallel Programming Laboratory, Department of Computer Science, University …, 2011 | 8 | 2011 |
Automating multi-level performance elastic components for IBM streams X Ni, S Schneider, R Pavuluri, J Kaus, KL Wu Proceedings of the 20th International Middleware Conference, 163-175, 2019 | 7 | 2019 |
Runtime Techniques for Programming with Fast and Slow Memory X Ni, N Jain, K Chandrasekar, LV Kale 2017 IEEE International Conference on Cluster Computing (CLUSTER), 147-151, 2017 | 3 | 2017 |