We present MM1.5, a new family of multimodal large language models (MLLMs) designed to enhance capabilities in text-rich image understanding, visual referring and grounding …
When mobile devices meet LLMs, app users deserve more intelligent usage experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …
Generative AI is transforming enterprise application development by enabling machines to create content, code, and designs. These models, however, demand substantial …
L Gao, L Zhang, S Wang, S Wang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Mobile screen assistants help smartphone users by interpreting mobile screens and responding to user requests. The excessive private information on mobile screens …
Y Shen, M Stallone, M Mishra, G Zhang, S Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Finding the optimal learning rate for language model pretraining is a challenging task. This is not only because there is a complicated correlation between learning rate, batch size …
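One ingredient of the coupling this snippet alludes to is the widely cited linear learning-rate scaling heuristic, sketched below. This is a common rule of thumb from the literature, not the scheduler proposed in the paper above, and the numbers are illustrative.

```python
# Sketch of the common linear LR-batch-size scaling rule of thumb
# (a generic heuristic, not the method of the paper above), illustrating
# one of the couplings that makes learning-rate search hard.

def scaled_lr(base_lr: float, base_batch: int, batch: int) -> float:
    # Linear scaling rule: learning rate grows proportionally with batch size.
    return base_lr * batch / base_batch

for b in (256, 512, 1024, 2048):
    print(b, scaled_lr(3e-4, 256, b))
```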
Fine-tuning large language models (LLMs) with low-rank adaptations (LoRAs) has become common practice, often yielding numerous copies of the same LLM differing only in their …
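For context, the standard LoRA update explains why such copies differ only in small matrices: the frozen base weight W is shared, and each fine-tuned variant adds only a low-rank delta (alpha / r) · BA. The sketch below uses made-up dimensions and is illustrative of LoRA in general, not of this paper's specific method.

```python
import numpy as np

# Minimal sketch of a LoRA-adapted linear layer. The base weight W is
# shared across all fine-tuned variants; each variant stores only its own
# small low-rank factors A and B, which is why model copies differ only
# in their adapter weights.

d_out, d_in, r = 64, 64, 4           # illustrative sizes; r << d
W = np.random.randn(d_out, d_in)     # frozen pretrained weight (shared)
A = np.random.randn(r, d_in) * 0.01  # adapter factor A (per variant)
B = np.zeros((d_out, r))             # adapter factor B, zero-initialized
alpha = 8.0                          # LoRA scaling hyperparameter

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B are trained.
    return x @ (W + (alpha / r) * B @ A).T

x = np.random.randn(2, d_in)
y = lora_forward(x)                  # shape (2, d_out)
```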
B Hou, Q Chen, J Wang, G Yin, C Wang, N Du… - arXiv preprint arXiv …, 2025 - arxiv.org
With the rapid scaling of large language models (LLMs), structured pruning has become a widely used technique to learn efficient, smaller models from larger ones, delivering superior …
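As background on the technique named here, a minimal structured-pruning sketch: removing whole output rows (neurons) of a weight matrix by a simple L2-norm score, so the result is genuinely smaller and dense. This illustrates structured pruning generically; the paper above learns which structures to remove rather than applying a fixed norm heuristic.

```python
import numpy as np

# Generic magnitude-based structured pruning sketch (illustrative only).
# Entire output rows are dropped, so the pruned weight matrix is a smaller
# dense matrix rather than a sparse one of the original size.

def prune_rows(W, keep_ratio=0.5):
    scores = np.linalg.norm(W, axis=1)        # L2 norm per output row
    k = max(1, int(keep_ratio * W.shape[0]))  # number of rows to keep
    keep = np.sort(np.argsort(scores)[-k:])   # indices of strongest rows
    return W[keep], keep                      # smaller dense matrix

W = np.random.randn(8, 16)
W_small, kept = prune_rows(W, keep_ratio=0.5) # (4, 16) remains
```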
Y Zeng, X Ding, Y Wang, W Liu, W Ning, Y Hou… - arXiv preprint arXiv …, 2025 - arxiv.org
Augmenting large language models (LLMs) with external tools is a promising approach to enhance their capabilities. Effectively leveraging this potential for complex tasks hinges …
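A schematic of the common tool-use loop may make "augmenting with external tools" concrete: the model either answers directly or requests a tool; the runtime executes the tool and feeds the result back. Everything below (`call_llm`, `get_weather`, the message format) is a hypothetical stand-in for illustration, not the system of the paper above.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical external tool the model may invoke.
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def call_llm(messages):
    # Hypothetical model call; here we fake a single tool request,
    # then a final answer once a tool result is present.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_weather", "args": {"city": "Paris"}}
    return {"content": "It is 21 C in Paris."}

def run(user_prompt):
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = call_llm(messages)
        if "tool" in reply:  # model asked for a tool: run it, feed result back
            result = TOOLS[reply["tool"]](**reply["args"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["content"]

print(run("What's the weather in Paris?"))
```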
J Xu, H Pu, D Wang - Micromachines, 2024 - mdpi.com
Reconfigurable processor-based acceleration of deep convolutional neural network (DCNN) algorithms has emerged as a widely adopted technique, with particular attention on sparse …
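To make "sparse" concrete, below is a toy CSR (compressed sparse row) matrix-vector product: only nonzero weights are stored and touched, which is the arithmetic that sparse accelerators exploit. This is a software illustration of the data format, not the reconfigurable-processor design of the paper above.

```python
import numpy as np

# Toy CSR sparse matrix-vector product. Skipping zero entries is the
# source of the speedups that sparse DCNN accelerators target.

def csr_spmv(data, indices, indptr, x):
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]  # only nonzeros are multiplied
    return y

# A 3x4 matrix with 4 nonzeros, stored in CSR form.
data    = np.array([2.0, 1.0, 3.0, 4.0])
indices = np.array([0, 3, 1, 2])
indptr  = np.array([0, 2, 3, 4])
x = np.array([1.0, 1.0, 1.0, 1.0])
print(csr_spmv(data, indices, indptr, x))      # [3. 3. 4.]
```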