Experimental evaluation of emerging multi-core architectures A Kayi, Y Yao, T El-Ghazawi, G Newby 2007 IEEE International Parallel and Distributed Processing Symposium, 1-6, 2007 | 38 | 2007 |
Performance issues in emerging homogeneous multi-core architectures A Kayi, T El-Ghazawi, GB Newby Simulation Modelling Practice and Theory 17 (9), 1485-1499, 2009 | 33 | 2009 |
Comparing runtime systems with exascale ambitions using the parallel research kernels RF Van der Wijngaart, A Kayi, JR Hammond, G Jost, T St. John, ... High Performance Computing: 31st International Conference, ISC High …, 2016 | 26 | 2016 |
A Highly-Efficient Distributed Deep Learning System For Automatic Speech Recognition MP Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi ... Interspeech, 2019 | 21 | 2019 |
Adaptive cache coherence mechanisms with producer–consumer sharing optimization for chip multiprocessors A Kayi, O Serres, T El-Ghazawi IEEE Transactions on Computers 64 (2), 316-328, 2013 | 20 | 2013 |
Improving efficiency in large-scale decentralized distributed training W Zhang, X Cui, A Kayi, M Liu, U Finkler, B Kingsbury, G Saon, Y Mroueh, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 15 | 2020 |
Using the parallel research kernels to study PGAS models RF Van der Wijngaart, S Sridharan, A Kayi, G Jost, JR Hammond, ... 2015 9th International Conference on Partitioned Global Address Space …, 2015 | 15 | 2015 |
Address translation optimization for Unified Parallel C multi-dimensional arrays O Serres, A Anbar, SG Merchant, A Kayi, T El-Ghazawi 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 15 | 2011 |
An adaptive cache coherence protocol for chip multiprocessors A Kayi, T El-Ghazawi Proceedings of the Second International Forum on Next-Generation Multicore …, 2010 | 15 | 2010 |
Performance evaluation of clusters with ccNUMA nodes - a case study A Kayi, E Kornkven, T El-Ghazawi, S Al-Bahra, GB Newby 2008 10th IEEE International Conference on High Performance Computing and …, 2008 | 12 | 2008 |
Disaggregated system domain JA Kahle, CR Johns, C Evangelinos, A Kayi US Patent 11,561,844, 2023 | 11 | 2023 |
Application performance tuning for clusters with ccNUMA nodes A Kayi, E Kornkven, T El-Ghazawi, G Newby 2008 11th IEEE international conference on computational science and …, 2008 | 10 | 2008 |
Asynchronous decentralized distributed training of acoustic models X Cui, W Zhang, A Kayi, M Liu, U Finkler, B Kingsbury, G Saon, D Kung IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3565-3576, 2021 | 5 | 2021 |
Enabling PGAS productivity with hardware support for shared address mapping: A UPC case study O Serres, A Kayi, A Anbar, T El-Ghazawi ACM Transactions on Architecture and Code Optimization (TACO) 12 (4), 1-26, 2015 | 4 | 2015 |
Hardware support for address mapping in PGAS languages: a UPC case study O Serres, A Kayi, A Anbar, T El-Ghazawi Proceedings of the 11th ACM Conference on Computing Frontiers, 1-2, 2014 | 4 | 2014 |
Performance analysis and tuning for clusters with ccNUMA nodes for scientific computing--A case study A Kayi, E Kornkven, T El-Ghazawi, S Al-Bahra, GB Newby Computer Systems Science and Engineering 24 (5), 291, 2010 | 4 | 2010 |
Bandwidth adaptive write-update optimizations for chip multiprocessors A Kayi, O Serres, T El-Ghazawi 2012 IEEE 10th International Symposium on Parallel and Distributed …, 2012 | 3 | 2012 |
Dynamic network bandwidth in distributed deep learning training W Zhang, X Cui, A Kayi, A Buyuktosunoglu US Patent 11,886,969, 2024 | 2 | 2024 |
Dynamic computation in decentralized distributed deep learning training W Zhang, X Cui, A Kayi, A Buyuktosunoglu US Patent 11,875,256, 2024 | 2 | 2024 |
Updating of statistical sets for decentralized distributed training of a machine learning model X Cui, W Zhang, M Liu, A Kayi, Y Mroueh, A Buyuktosunoglu US Patent 11,636,280, 2023 | 2 | 2023 |