Deferred continuous batching in resource-efficient large language model serving

Y He, Y Lu, G Alonso - Proceedings of the 4th Workshop on Machine …, 2024 - dl.acm.org
Despite that prior work of batched inference and parameter-efficient fine-tuning techniques
have reduced the resource requirements of large language models (LLMs), challenges …

Automation of AD-OHC Dashbord and Monitoring of Cloud Resources using Genrative AI to Reduce Costing and Enhance Performance

P Chavan, P Chavan - 2024 International Conference on …, 2024 - ieeexplore.ieee.org
In this extensive review, the incorporation of Generative Artificial Intelligence (AI) into ad-hoc
dashboards and cloud resource monitoring is investigated in depth. A purpose of this work is …

BOOM: Use your Desktop to Accurately Predict the Performance of Large Deep Neural Networks

Q Su, J Yang, G Pekhimenko - … of the 2024 International Conference on …, 2024 - dl.acm.org
The intensive computational requirements of training deep neural networks (DNNs) have
significantly driven the adoption of DNN accelerators like Graphics Processing Units (GPU) …

Large Generative Model-enabled Digital Twin for 6G Networks

Y Yang, W Sun, J He, Y Fu, L Xu - IEEE Network, 2024 - ieeexplore.ieee.org
The next generation (6G) wireless networks are under intensive research and envisioned to
realize the interconnection of everything and ubiquitous intelligence. One of the major …

Performance evaluation of cloud database in terms of response time using tenancy model and in-memory database

A Shah, M Patel, M Patel - 2024 IEEE International Conference …, 2024 - ieeexplore.ieee.org
Cloud databases are now essential parts of contemporary information systems, providing
scalability, flexibility, and cost-effectiveness to enterprises in various industries. Cloud …

Enhancing Operational Data Synthesis and Predictive Analysis in HPC Clusters Using Large Language Models

Y Zang - 2024 - atlarge-research.com
High-Performance Computing (HPC) clusters are integral to advancing scientific
research, industrial optimization, and various computational tasks. Researchers, industrial …

Intelligent Network Optimization in Cloud Environments with Generative AI and LLMs

K Patil, B Desai - 2024 - preprints.org
This paper represents a groundbreaking paradigm shift in network optimization. Departing
from traditional static methodologies, this innovative approach harnesses the power of …

Generative AI Meets Cloud Networking: A New Era of Dynamic Optimization

H Miyamoto, SNA Tan - Asian American Research Letters Journal, 2024 - aarlj.com
This paper represents a groundbreaking paradigm shift in network optimization. Departing
from traditional static methodologies, this innovative approach harnesses the power of …