Efficient memory management for large language model serving with PagedAttention

W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng… - Proceedings of the 29th …, 2023 - dl.acm.org
High throughput serving of large language models (LLMs) requires batching sufficiently
many requests at a time. However, existing systems struggle because the key-value cache …

4D Gaussian splatting for real-time dynamic scene rendering

G Wu, T Yi, J Fang, L Xie, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Representing and rendering dynamic scenes has been an important but challenging task.
Especially to accurately model complex motions, high efficiency is usually hard to guarantee …

ViperGPT: Visual inference via Python execution for reasoning

D Surís, S Menon, C Vondrick - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Answering visual queries is a complex task that requires both visual processing and
reasoning. End-to-end models, the dominant approach for this task, do not explicitly …

Objaverse-XL: A universe of 10M+ 3D objects

M Deitke, R Liu, M Wallingford, H Ngo… - Advances in …, 2024 - proceedings.neurips.cc
Natural language processing and 2D vision models have attained remarkable proficiency on
many tasks primarily by escalating the scale of training data. However, 3D vision tasks have …

eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Large-scale diffusion-based generative models have led to breakthroughs in
text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image …

Nerfstudio: A modular framework for neural radiance field development

M Tancik, E Weber, E Ng, R Li, B Yi, T Wang… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
Neural Radiance Fields (NeRF) are a rapidly growing area of research with wide-ranging
applications in computer vision, graphics, robotics, and more. In order to streamline the …

Robust deep learning–based protein sequence design using ProteinMPNN

J Dauparas, I Anishchenko, N Bennett, H Bai… - Science, 2022 - science.org
Although deep learning has revolutionized protein structure prediction, almost all
experimentally characterized de novo protein designs have been generated using …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …

TimesNet: Temporal 2D-variation modeling for general time series analysis

H Wu, T Hu, Y Liu, H Zhou, J Wang, M Long - arXiv preprint arXiv …, 2022 - arxiv.org
Time series analysis is of immense importance in extensive applications, such as weather
forecasting, anomaly detection, and action recognition. This paper focuses on temporal …

Large language models are zero-shot reasoners

T Kojima, SS Gu, M Reid, Y Matsuo… - Advances in neural …, 2022 - proceedings.neurips.cc
Pretrained large language models (LLMs) are widely used in many sub-fields of natural
language processing (NLP) and generally known as excellent few-shot learners with task …