Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs). Its modular and plug-and-play nature allows the integration of various domain …
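For context, a minimal sketch of the low-rank update this abstract refers to, assuming a PyTorch setting; the rank, scaling, and layer sizes below are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: h = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Plug-and-play: wrap an existing projection and train only A and B,
# leaving the base model untouched.
layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the adapter parameters are trainable
```

Because the update is an additive, self-contained module, adapters trained this way can be swapped or combined without retraining the base weights, which is the modularity the abstract highlights.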
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve …
In multi-task learning, the conventional approach involves training a model on multiple tasks simultaneously. However, the training signals from different tasks can interfere with one …
Information propagation is the process through which data are transmitted within a system. The expansion of large-scale web datasets has led to explosive growth in information …