Executing synchronous dataflow graphs on a SPM-based multicore architecture

J Choi, H Oh, S Kim, S Ha - Proceedings of the 49th Annual Design …, 2012 - dl.acm.org
In this paper we are concerned about executing synchronous dataflow (SDF) applications
on a multicore architecture where a core has a limited size of scratchpad memory (SPM) …

Towards memory-efficient allocation of CNNs on processing-in-memory architecture

Y Wang, W Chen, J Yang, T Li - IEEE Transactions on Parallel …, 2018 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) have been successfully applied in artificial intelligent
systems to perform sensory processing, sequence learning, and image processing. In …

A differentiated caching mechanism to enable primary storage deduplication in clouds

H Wu, C Wang, Y Fu, S Sakr, K Lu… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Existing primary deduplication techniques either use inline caching to exploit locality in
primary workloads or use post-processing deduplication to avoid the negative impact on I/O …

Memory-aware task scheduling with communication overhead minimization for streaming applications on bus-based multiprocessor system-on-chips

Y Wang, Z Shao, HCB Chan, D Liu… - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
Inter-core communication introduces overheads in task schedules on Multiprocessor
System-on-Chips (MPSoCs). Inter-core communication overhead not only negatively impacts the …

Towards memory-efficient processing-in-memory architecture for convolutional neural networks

Y Wang, M Zhang, J Yang - Proceedings of the 18th ACM SIGPLAN …, 2017 - dl.acm.org
Convolutional neural networks (CNNs) are widely adopted in artificial intelligent systems. In
contrast to conventional computing centric applications, the computational and memory …

Memory space recycling

J Ryoo, MT Kandemir, M Karakoy - … of the ACM on Measurement and …, 2022 - dl.acm.org
Many program codes from different application domains process very large amounts of data,
making their cache memory behavior critical for high performance. Most of the existing work …

Towards cross-platform inference on edge devices with emerging neuromorphic architecture

S Wu, Y Wang, AC Zhou, R Mao… - … Design, Automation & …, 2019 - ieeexplore.ieee.org
Deep convolutional neural networks have become the mainstream solution for many
artificial intelligence applications. However, they are still rarely deployed on mobile or edge …

Memory-aware optimal scheduling with communication overhead minimization for streaming applications on chip multiprocessors

Y Wang, D Liu, Z Qin, Z Shao - 2010 31st IEEE Real-Time …, 2010 - ieeexplore.ieee.org
In this paper, we focus on solving the problem of removing inter-core communication
overhead for streaming applications on chip multiprocessors. The objective is to totally …

Postscheduling buffer management trade-offs in streaming software synthesis

MH Foroozannejad, T Hodges, M Hashemi… - ACM Transactions on …, 2012 - dl.acm.org
Streaming applications, which are abundant in many disciplines such as multimedia,
networking, and signal processing, require efficient processing of a seemingly infinite …

SAMOSA: Scratchpad aware mapping of streaming applications

ZW Bhatti, D Preuveneers, Y Berbers… - … on System on Chip …, 2011 - ieeexplore.ieee.org
Scratchpad memories have now emerged as an alternative to caches for energy constrained
embedded systems. However, effectively mapping data on them while considering …