Multi-layer transformer gradient can be approximated in almost linear time

Y Liang, Z Sha, Z Shi, Z Song, Y Zhou - arXiv preprint arXiv:2408.13233, 2024 - arxiv.org
The computational complexity of the self-attention mechanism in popular transformer
architectures poses significant challenges for training and inference, and becomes the …

Retrieval augmented generation or long-context LLMs? A comprehensive study and hybrid approach

Z Li, C Li, M Zhang, Q Mei… - Proceedings of the 2024 …, 2024 - aclanthology.org
Retrieval Augmented Generation (RAG) has been a powerful tool for Large
Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent …

LongMemEval: Benchmarking chat assistants on long-term interactive memory

D Wu, H Wang, W Yu, Y Zhang, KW Chang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent large language model (LLM)-driven chat assistant systems have integrated memory
components to track user-assistant chat histories, enabling more accurate and personalized …

Flooding spread of manipulated knowledge in LLM-based multi-agent communities

T Ju, Y Wang, X Ma, P Cheng, H Zhao, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid adoption of large language models (LLMs) in multi-agent systems has highlighted
their impressive capabilities in various applications, such as collaborative problem-solving …

Circuit Complexity Bounds for RoPE-based Transformer Architecture

B Chen, X Li, Y Liang, J Long, Z Shi, Z Song - arXiv preprint arXiv …, 2024 - arxiv.org
Characterizing the expressive power of the Transformer architecture is critical to understanding
its capacity limits and scaling laws. Recent works provide circuit complexity bounds to …

What is Wrong with Perplexity for Long-context Language Modeling?

L Fang, Y Wang, Z Liu, C Zhang, S Jegelka… - arXiv preprint arXiv …, 2024 - arxiv.org
Handling long-context inputs is crucial for large language models (LLMs) in tasks such as
extended conversations, document summarization, and many-shot in-context learning …

MobA: A two-level agent system for efficient mobile task automation

Z Zhu, H Tang, Y Li, K Lan, Y Jiang, H Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Current mobile assistants are limited by dependence on system APIs or struggle with
complex user instructions and diverse interfaces due to restricted comprehension and …

Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge

YJ Lee, D Lee, J Youn, K Oh, B Ko, J Hyeon… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans share a wide variety of images related to their personal experiences within
conversations via instant messaging tools. However, existing works focus on (1) image …

MemSim: A Bayesian simulator for evaluating memory of LLM-based personal assistants

Z Zhang, Q Dai, L Chen, Z Jiang, R Li, J Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
LLM-based agents have been widely applied as personal assistants, capable of memorizing
information from user messages and responding to personal queries. However, there still …

Graph representations for machine translation in dialogue settings

L Krause, SB Santamaria, JC Kalo - Proceedings of the Ninth …, 2024 - aclanthology.org
In this paper, we present our approach to the WMT24-Chat Task, addressing the challenge
of translating chat conversations. Chat conversations are characterised by their informal …