Smaller, weaker, yet better: Training LLM reasoners via compute-optimal sampling

H Bansal, A Hosseini, R Agarwal, VQ Tran… - arXiv preprint arXiv …, 2024 - arxiv.org
Training on high-quality synthetic data from strong language models (LMs) is a common
strategy to improve the reasoning performance of LMs. In this work, we revisit whether this …
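
As context for the compute-matched sampling idea mentioned in the snippet, the sketch below shows the basic budget arithmetic: under a fixed generation budget, a cheaper "weak" model yields many more samples per problem than an expensive "strong" one. The budget, problem count, and per-sample costs are illustrative assumptions, not figures from the paper.

```python
# Hypothetical compute-matched sampling budget: how many solutions per
# problem fit into the same total generation budget for two models of
# different per-sample cost. All numbers below are assumed for illustration.

def samples_per_problem(total_budget_flops: float,
                        num_problems: int,
                        flops_per_sample: float) -> int:
    """Number of sampled solutions per problem under the budget."""
    return int(total_budget_flops / (num_problems * flops_per_sample))

BUDGET = 1e18        # assumed total sampling budget (FLOPs)
N_PROBLEMS = 10_000  # assumed dataset size

# Assumed per-sample costs; a small model is roughly an order of
# magnitude cheaper per generated solution than a large one.
WEAK_COST, STRONG_COST = 1e12, 1e13

print("weak model samples/problem:  ", samples_per_problem(BUDGET, N_PROBLEMS, WEAK_COST))
print("strong model samples/problem:", samples_per_problem(BUDGET, N_PROBLEMS, STRONG_COST))
```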

MM1.5: Methods, analysis & insights from multimodal LLM fine-tuning

H Zhang, M Gao, Z Gan, P Dufter, N Wenzel… - arXiv preprint arXiv …, 2024 - arxiv.org
We present MM1.5, a new family of multimodal large language models (MLLMs) designed
to enhance capabilities in text-rich image understanding, visual referring and grounding …

LLM for mobile: An initial roadmap

D Chen, Y Liu, M Zhou, Y Zhao, H Wang… - ACM Transactions on …, 2024 - dl.acm.org
When mobile meets LLMs, mobile app users deserve to have more intelligent usage
experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …

Cloud Platforms for Developing Generative AI Solutions: A Scoping Review of Tools and Services

D Patel, G Raut, SN Cheetirala, GN Nadkarni… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative AI is transforming enterprise application development by enabling machines to
create content, code, and designs. These models, however, demand substantial …

MobileViews: A large-scale mobile GUI dataset

L Gao, L Zhang, S Wang, S Wang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Mobile screen assistants help smartphone users by interpreting mobile screens and
responding to user requests. The excessive private information on mobile screens …

Power scheduler: A batch size and token number agnostic learning rate scheduler

Y Shen, M Stallone, M Mishra, G Zhang, S Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Finding the optimal learning rate for language model pretraining is a challenging task. This
is not only because there is a complicated correlation between learning rate, batch size …
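
For background on what a learning-rate schedule of this general kind looks like, here is a generic power-law decay with linear warmup; the base rate, warmup length, and exponent are arbitrary illustrative choices and not the schedule proposed in the paper.

```python
# Generic power-law learning-rate schedule with linear warmup.
# All hyperparameters here are illustrative assumptions.

def power_lr(step: int, base_lr: float = 3e-4,
             warmup_steps: int = 1000, exponent: float = 0.5) -> float:
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps            # linear warmup
    return base_lr * (warmup_steps / (step + 1)) ** exponent  # power-law decay

for s in (0, 500, 1_000, 10_000, 100_000):
    print(s, round(power_lr(s), 6))
```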

Compress then serve: Serving thousands of LoRA adapters with little overhead

R Brüel-Gabrielsson, J Zhu, O Bhardwaj… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning large language models (LLMs) with low-rank adaptations (LoRAs) has become
common practice, often yielding numerous copies of the same LLM differing only in their …
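
As background, LoRA expresses each fine-tuned variant as a low-rank correction to a shared frozen weight, so an adapter is just two small matrices. The sketch below shows that standard parameterization with illustrative shapes and scaling; it is not the compression or serving scheme studied in the paper.

```python
import numpy as np

# Standard LoRA parameterization: effective weight = W + (alpha / r) * B @ A,
# where A (r x d_in) and B (d_out x r) are the only per-adapter parameters.
# Dimensions, rank, and scaling below are illustrative.
d_in, d_out, r, alpha = 4096, 4096, 16, 32
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in)) * 0.02   # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01       # adapter "down" projection
B = np.zeros((d_out, r))                        # adapter "up" projection (zero-init)

def forward(x: np.ndarray, use_adapter: bool = True) -> np.ndarray:
    y = W @ x
    if use_adapter:
        y = y + (alpha / r) * (B @ (A @ x))     # low-rank correction
    return y

x = rng.standard_normal(d_in)
print(forward(x).shape)  # (4096,)
```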

Instruction-Following Pruning for Large Language Models

B Hou, Q Chen, J Wang, G Yin, C Wang, N Du… - arXiv preprint arXiv …, 2025 - arxiv.org
With the rapid scaling of large language models (LLMs), structured pruning has become a
widely used technique to learn efficient, smaller models from larger ones, delivering superior …
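
To make the notion of structured pruning concrete, the toy sketch below removes whole output channels of a weight matrix by L2 norm. Both the norm-based criterion and the keep ratio are generic assumptions for illustration, not the instruction-conditioned pruning the paper proposes.

```python
import numpy as np

# Toy structured pruning: rank output channels (rows) by L2 norm and keep
# only the strongest fraction, producing a smaller dense matrix.
# Criterion and ratio are illustrative, not the paper's method.

def prune_rows(W: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    norms = np.linalg.norm(W, axis=1)                 # one score per output channel
    k = max(1, int(round(keep_ratio * W.shape[0])))
    keep = np.sort(np.argsort(norms)[-k:])            # indices of strongest channels
    return W[keep]

W = np.random.default_rng(0).standard_normal((8, 16))
print(prune_rows(W, keep_ratio=0.25).shape)  # (2, 16)
```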

Boosting Tool Use of Large Language Models via Iterative Reinforced Fine-Tuning

Y Zeng, X Ding, Y Wang, W Liu, W Ning, Y Hou… - arXiv preprint arXiv …, 2025 - arxiv.org
Augmenting large language models (LLMs) with external tools is a promising approach to
enhance their capabilities. Effectively leveraging this potential for complex tasks hinges …
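
For orientation, a bare-bones tool-use loop looks roughly like the sketch below: the model emits a tool call, the runtime executes it, and the result is fed back into the context. The CALL[tool](args) format, the stub model, and the calculator tool are all invented for illustration; the paper concerns the fine-tuning procedure, not this loop.

```python
import ast
import re

# Minimal tool-use loop with an assumed CALL[tool](args) format and a
# stand-in "model". Everything here is a hypothetical illustration.

TOOLS = {"add": lambda a, b: a + b}

def fake_model(context: str) -> str:
    # Stand-in for an LLM: requests a tool once, then gives a final answer.
    if "RESULT" not in context:
        return "CALL[add](2, 3)"
    return "The answer is 5."

def run(query: str, max_turns: int = 4) -> str:
    context = query
    out = ""
    for _ in range(max_turns):
        out = fake_model(context)
        m = re.match(r"CALL\[(\w+)\]\((.*)\)", out)
        if not m:
            return out                                   # no tool call: final answer
        name = m.group(1)
        args = ast.literal_eval(f"({m.group(2)},)")      # parse argument tuple
        result = TOOLS[name](*args)
        context += f"\nRESULT[{name}] = {result}"        # feed result back to the model
    return out

print(run("What is 2 + 3?"))
```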

Sparse Convolution FPGA Accelerator Based on Multi-Bank Hash Selection

J Xu, H Pu, D Wang - Micromachines, 2024 - mdpi.com
Reconfigurable processor-based acceleration of deep convolutional neural network (DCNN)
algorithms has emerged as a widely adopted technique, with particular attention on sparse …
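
As a software-level illustration of the multi-bank hashing idea, i.e. spreading nonzero elements across memory banks to limit access conflicts, the toy sketch below hashes the coordinates of a sparse tensor's nonzeros into a fixed number of banks and reports the per-bank load. The hash function, bank count, and tensor shape are assumptions, not the accelerator's actual design.

```python
from collections import Counter
import numpy as np

# Toy multi-bank placement for sparse data: hash each nonzero element's
# coordinates to one of NUM_BANKS banks and inspect how evenly the load
# spreads. Hash and bank count are illustrative only.

NUM_BANKS = 8
rng = np.random.default_rng(0)

# Random 3x3x64x64 convolution weight tensor with ~90% zeros.
weights = rng.standard_normal((3, 3, 64, 64)) * (rng.random((3, 3, 64, 64)) > 0.9)

def bank_of(coord) -> int:
    # Simple multiplicative hash over the (kh, kw, ci, co) coordinate.
    kh, kw, ci, co = coord
    return (kh * 2654435761 + kw * 40503 + ci * 97 + co) % NUM_BANKS

loads = Counter(bank_of(tuple(idx)) for idx in np.argwhere(weights != 0))
print({b: loads[b] for b in range(NUM_BANKS)})
```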