H2O: Heavy-hitter oracle for efficient generative inference of large language models

Z Zhang, Y Sheng, T Zhou, T Chen… - Advances in …, 2023 - proceedings.neurips.cc
Abstract Large Language Models (LLMs), despite their recent impressive accomplishments,
are notably cost-prohibitive to deploy, particularly for applications involving long-content …
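The snippet cuts off before the method, but the core idea of H2O is to shrink the KV cache by evicting tokens that have received little attention mass, keeping a small set of "heavy hitter" tokens plus a recent window. A minimal NumPy sketch of that eviction rule, assuming per-token accumulated attention scores are available (names and signatures are hypothetical, not the paper's code):

```python
import numpy as np

def h2o_evict(keys, values, acc_scores, budget, recent):
    """Keep the `recent` most recent KV entries plus the `budget` older
    entries with the highest accumulated attention (the heavy hitters).

    keys, values: (seq_len, d) arrays for one attention head.
    acc_scores:   (seq_len,) attention mass each token has received so far.
    """
    seq_len = keys.shape[0]
    recent_idx = set(range(max(0, seq_len - recent), seq_len))
    # Rank the older tokens by accumulated attention; keep the top `budget`.
    older = [i for i in range(seq_len) if i not in recent_idx]
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:budget]
    keep = sorted(recent_idx | set(heavy))
    return keys[keep], values[keep], acc_scores[keep]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = rng.standard_normal((1024, 64))
    v = rng.standard_normal((1024, 64))
    s = rng.random(1024)
    k2, v2, s2 = h2o_evict(k, v, s, budget=64, recent=32)
    print(k2.shape)  # (96, 64): cache budget stays fixed as generation proceeds
```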

Attention scheme inspired softmax regression

Y Deng, Z Li, Z Song - arXiv preprint arXiv:2304.10411, 2023 - arxiv.org
Large language models (LLMs) have made transformative changes to human society. One of
the key computations in LLMs is the softmax unit. This operation is important in LLMs …
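For context, the softmax regression problem studied in this line of work has roughly the following form; a sketch, with the exact normalization and regularization terms following the paper:

```latex
% Softmax regression (sketch): given A in R^{n x d} and b in R^n, solve
\[
  \min_{x \in \mathbb{R}^{d}}
  \Big\| \big\langle \exp(Ax), \mathbf{1}_n \big\rangle^{-1} \exp(Ax) - b \Big\|_2^2 ,
\]
% where exp is applied entrywise; the normalization term is what makes this
% a softmax rather than an ordinary exponential regression.
```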

Algorithm and hardness for dynamic attention maintenance in large language models

J Brand, Z Song, T Zhou - arXiv preprint arXiv:2304.02207, 2023 - arxiv.org
Large language models (LLMs) have made fundamental changes in human life. The
attention scheme is one of the key components across LLMs such as BERT, GPT-1 …
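The object being maintained dynamically is the standard attention map; schematically, in the notation common to this line of work (a sketch):

```latex
% Attention map (sketch): for Q, K, V in R^{n x d},
\[
  \mathrm{Attn}(Q, K, V) = D^{-1} \exp(QK^{\top})\, V,
  \qquad D = \mathrm{diag}\!\big(\exp(QK^{\top})\,\mathbf{1}_n\big),
\]
% with entrywise exp; the dynamic problem asks to maintain this quantity
% under updates to the input matrices rather than recomputing from scratch.
```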

Training multi-layer over-parametrized neural network in subquadratic time

Z Song, L Zhang, R Zhang - arXiv preprint arXiv:2112.07628, 2021 - arxiv.org
We consider the problem of training a multi-layer over-parametrized neural network to
minimize the empirical risk induced by a loss function. In the typical setting of over …
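The setup is standard empirical risk minimization; as a sketch of the objective the abstract refers to, for a multi-layer network f_W:

```latex
% Empirical risk over n training examples (sketch):
\[
  \min_{W} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_W(x_i), y_i\big),
\]
% where over-parametrization means the network width is large relative to
% what the n examples require.
```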

A faster small treewidth SDP solver

Y Gu, Z Song - arXiv preprint arXiv:2211.06033, 2022 - arxiv.org
Semidefinite programming is a fundamental tool in optimization and theoretical computer
science. It has been extensively used as a black-box for solving many problems, such as …
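For reference, a sketch of the standard-form SDP such solvers target; reading the title, the small-treewidth structure lives in the aggregate sparsity pattern of the constraint matrices (my reading, not a quote from the paper):

```latex
% Standard-form SDP (sketch):
\[
  \max_{X \succeq 0} \; \langle C, X \rangle
  \quad \text{s.t.} \quad \langle A_i, X \rangle = b_i, \; i = 1, \dots, m,
\]
% with the solver exploiting small treewidth of the sparsity graph shared
% by C and the A_i.
```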

MONGOOSE: A learnable LSH framework for efficient neural network training

B Chen, Z Liu, B Peng, Z Xu, JL Li, T Dao… - International …, 2020 - openreview.net
Recent advances by practitioners in the deep learning community have breathed new life
into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in …
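The LSH use case here is retrieving, for each input activation, only the neurons whose weight vectors are likely near it, so most of a wide layer can be skipped during training. A minimal SimHash-style sketch in NumPy (hypothetical helpers, not the MONGOOSE API, which additionally makes the hash functions learnable):

```python
import numpy as np

def simhash_tables(weights, num_hashes, rng):
    """Hash each neuron's weight vector with random signed projections
    (SimHash). weights: (num_neurons, d)."""
    planes = rng.standard_normal((num_hashes, weights.shape[1]))
    # Each neuron gets a num_hashes-bit signature from the projection signs.
    bits = (weights @ planes.T) > 0
    codes = bits @ (1 << np.arange(num_hashes))
    table = {}
    for neuron, code in enumerate(codes):
        table.setdefault(int(code), []).append(neuron)
    return planes, table

def query_active_neurons(planes, table, x):
    """Return the neurons whose signatures collide with activation x."""
    bits = (planes @ x) > 0
    code = int(bits @ (1 << np.arange(len(bits))))
    return table.get(code, [])

rng = np.random.default_rng(0)
planes, table = simhash_tables(rng.standard_normal((4096, 128)), 12, rng)
print(query_active_neurons(planes, table, rng.standard_normal(128)))
```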

A tighter complexity analysis of SparseGPT

X Li, Y Liang, Z Shi, Z Song - arXiv preprint arXiv:2408.12151, 2024 - arxiv.org
In this work, we improve the analysis of the running time of SparseGPT [Frantar, Alistarh
ICML 2023] from $O(d^{3})$ to $O(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a})$ …
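One way to read the improved bound (a sketch; exponent values are approximate, and the parameter range a ∈ [0, 1] is my reading of the paper):

```latex
% The improved running time, with a a tunable parameter:
\[
  O\big(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a}\big),
  \qquad a \in [0, 1],
\]
% where \omega \approx 2.37 is the square matrix multiplication exponent and
% \omega(1,1,a) is the rectangular exponent for multiplying an n x n matrix
% by an n x n^a matrix.
```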

InfoPrompt: Information-theoretic soft prompt tuning for natural language understanding

J Wu, T Yu, R Wang, Z Song, R Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Soft prompt tuning achieves superior performance across a wide range of few-shot tasks.
However, the performance of prompt tuning can be highly sensitive to the initialization of …
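Soft prompt tuning itself is simple to sketch: learnable embeddings are prepended to the input embeddings while the backbone model stays frozen, and the initialization of those embeddings is exactly the sensitivity the abstract points at. A minimal PyTorch sketch (a hypothetical module, not the InfoPrompt implementation, which adds information-theoretic losses on top):

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepend prompt_len learnable embeddings to the token embeddings;
    only these parameters are trained, the LM backbone is frozen."""
    def __init__(self, prompt_len, embed_dim):
        super().__init__()
        # Initialization matters: prompt tuning is sensitive to this choice,
        # which is the issue InfoPrompt targets.
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, token_embeds):  # token_embeds: (batch, seq, embed_dim)
        batch = token_embeds.shape[0]
        expanded = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([expanded, token_embeds], dim=1)

x = torch.randn(4, 16, 768)
print(SoftPrompt(8, 768)(x).shape)  # torch.Size([4, 24, 768])
```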

A faster algorithm for solving general LPs

S Jiang, Z Song, O Weinstein, H Zhang - Proceedings of the 53rd Annual …, 2021 - dl.acm.org
The fastest known LP solver for general (dense) linear programs is due to [Cohen, Lee and
Song '19] and runs in $O^*(n^{\omega} + n^{2.5-\alpha/2} + n^{2+1/6})$ time. A number of follow-up works [Lee …
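A rough reading of this bound under current exponent estimates (a sketch; ω ≈ 2.37 for matrix multiplication and dual exponent α ≈ 0.31, both approximate and varying by reference):

```latex
% The first term dominates, so the runtime is essentially matrix
% multiplication time:
\[
  O^*\big(n^{\omega} + n^{2.5-\alpha/2} + n^{2+1/6}\big)
  \approx O^*\big(n^{2.37} + n^{2.34} + n^{2.17}\big)
  = O^*\big(n^{\omega}\big).
\]
```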

Sketching meets differential privacy: fast algorithm for dynamic Kronecker projection maintenance

Z Song, X Yang, Y Yang… - … Conference on Machine …, 2023 - proceedings.mlr.press
Projection maintenance is one of the core data structure tasks. Efficient data structures for
projection maintenance have led to recent breakthroughs in many convex programming …
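Schematically, the maintained object in these convex-programming applications is the weighted projection below, with the Kronecker variant giving A product structure; a sketch in the notation common to this line of work:

```latex
% Weighted projection repeatedly queried by interior-point methods (sketch):
\[
  P(w) = \sqrt{W}\, A^{\top} \big(A W A^{\top}\big)^{-1} A\, \sqrt{W},
  \qquad W = \mathrm{diag}(w),
\]
% with the Kronecker variant taking A = A_1 \otimes A_2; the data-structure
% task is to update P(w) as w changes instead of recomputing it from scratch.
```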