FedMef: Towards Memory-efficient Federated Dynamic Pruning

H Huang, W Zhuang, C Chen… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Federated learning (FL) promotes decentralized training while prioritizing data
confidentiality. However, its application on resource-constrained devices is challenging due …

Bed: A real-time object detection system for edge devices

G Wang, ZP Bhat, Z Jiang, YW Chen, D Zha… - Proceedings of the 31st …, 2022 - dl.acm.org
Deploying deep neural networks (DNNs) on edge devices provides efficient and effective
solutions for the real-world tasks. Edge devices have been used for collecting a large …

[PDF] KIVI: Plug-and-play 2bit KV Cache Quantization with Streaming Asymmetric Quantization

Z Liu, J Yuan, H Jin, S Zhong, Z Xu, V Braverman… - 2024 - researchgate.net
Efficiently serving large language models (LLMs) requires batching many requests together
to reduce the cost per request. However, this approach faces challenges due to the …

[CITATION] Squeezing Efficiency: Exploring Activation and Gradient Compression in Deep Learning

K Kumar, H Arsalan