A survey on mobile edge computing for video streaming: Opportunities and challenges

MA Khan, E Baccour, Z Chkirbene, A Erbad… - IEEE …, 2022 - ieeexplore.ieee.org
5G communication brings substantial improvements in the quality of service provided to
various applications by achieving higher throughput and lower latency. However, interactive …

H2O: Heavy-hitter oracle for efficient generative inference of large language models

Z Zhang, Y Sheng, T Zhou, T Chen… - Advances in …, 2023 - proceedings.neurips.cc
Large Language Models (LLMs), despite their recent impressive accomplishments,
are notably cost-prohibitive to deploy, particularly for applications involving long-content …

A large-scale analysis of hundreds of in-memory key-value cache clusters at Twitter

J Yang, Y Yue, KV Rashmi - ACM Transactions on Storage (TOS), 2021 - dl.acm.org
Modern web services use in-memory caching extensively to increase throughput and reduce
latency. There have been several workload analyses of production systems that have fueled …

Joint communication, computation, caching, and control in big data multi-access edge computing

A Ndikumana, NH Tran, TM Ho, Z Han… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
The concept of Multi-access Edge Computing (MEC) has been recently introduced to
supplement cloud computing by deploying MEC servers to the network edge so as to reduce …

Adaptive bitrate video caching and processing in mobile-edge computing networks

TX Tran, D Pompili - IEEE Transactions on Mobile Computing, 2018 - ieeexplore.ieee.org
Mobile-Edge Computing (MEC) is a promising paradigm that provides storage and
computation resources at the network edge in order to support low-latency and computation …

High performance cache replacement using re-reference interval prediction (RRIP)

A Jaleel, KB Theobald, SC Steely Jr… - ACM SIGARCH computer …, 2010 - dl.acm.org
Practical cache replacement policies attempt to emulate optimal replacement by predicting
the re-reference interval of a cache block. The commonly used LRU replacement policy …

Distributed hierarchical GPU parameter server for massive scale deep learning ads systems

W Zhao, D Xie, R Jia, Y Qian, R Ding… - … of Machine Learning …, 2020 - proceedings.mlsys.org
Neural networks of ads systems usually take input from multiple resources, e.g., query-ad
relevance, ad features and user portraits. These inputs are encoded into one-hot or multi-hot …

Applying deep learning to the cache replacement problem

Z Shi, X Huang, A Jain, C Lin - Proceedings of the 52nd Annual IEEE …, 2019 - dl.acm.org
Despite its success in many areas, deep learning is a poor fit for use in hardware predictors
because these models are impractically large and slow, but this paper shows how we can …

Information-centric mobile caching network frameworks and caching optimization: a survey

H Jin, D Xu, C Zhao, D Liang - EURASIP Journal on Wireless …, 2017 - Springer
The demand for content oriented service and compute-intensive service stimulates the shift
of current cellular networks to deal with the explosive growth in mobile traffic. Information …

FIFO queues are all you need for cache eviction

J Yang, Y Zhang, Z Qiu, Y Yue, R Vinayak - Proceedings of the 29th …, 2023 - dl.acm.org
As a cache eviction algorithm, FIFO has a lot of attractive properties, such as simplicity,
speed, scalability, and flash-friendliness. The most prominent criticism of FIFO is its low …