Several classes of applications with abundant fine-grain parallelism, such as media and signal processing, graphics, and scientific computing, have become increasingly dominant …
M Dusanapudi, S Kamaraju, S Kapoor - US Patent 9,501,408, 2016 - Google Patents
A method of testing cache coherency in a computer system design allocates different portions of a single cache line for use by accelerators and processors …
Power consumption is an important design issue in current multimedia embedded systems. Data caches consume a significant portion of total processor power for multimedia …
M Dusanapudi, S Kamaraju, S Kapoor - US Patent App. 14/038,125, 2014 - Google Patents
A method of testing cache coherency in a computer system design allocates different portions of a single cache line for use by accelerators and processors …
Run-time code generation (RTCG) has been shown to be an effective optimization. Systems such as DyC, 'C, Tempo, and Fabius have demonstrated order-of-magnitude improvements …
N Wu, M Wen, J Ren, Y He, CQ Xun… - … Conference on High …, 2009 - ieeexplore.ieee.org
Due to the high bandwidth demands that stream applications place on the memory system, most stream processors use software-managed streaming memory. However, this memory …
P Faldu - arXiv preprint arXiv:2006.08487, 2020 - arxiv.org
The Last-Level Cache (LLC) represents the bulk of a modern CPU's transistor budget and is essential for application performance, as the LLC enables fast access to data in contrast …