In-context convergence of transformers

Y Huang, Y Cheng, Y Liang - arXiv preprint arXiv:2310.05249, 2023 - arxiv.org
Transformers have recently revolutionized many domains in modern machine learning, and one salient discovery is their remarkable in-context learning capability, where models can …

Competition-level problems are effective llm evaluators

Y Huang, Z Lin, X Liu, Y Gong, S Lu, F Lei… - Findings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet
there is ongoing debate about these abilities and the potential data contamination problem …

What Makes Multimodal In-Context Learning Work?

FB Baldassini, M Shukor, M Cord… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large Language Models have demonstrated remarkable performance across various tasks, exhibiting the capacity to swiftly acquire new skills, such as through In-Context …

Large language models for social networks: Applications, challenges, and solutions

J Zeng, R Huang, W Malik, L Yin, B Babic… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are transforming the way people generate, explore, and
engage with content. We study how we can develop LLM applications for online social …

Compositional generative modeling: A single model is not all you need

Y Du, L Kaelbling - arXiv preprint arXiv:2402.01103, 2024 - arxiv.org
Large monolithic generative models trained on massive amounts of data have become an
increasingly dominant approach in AI research. In this paper, we argue that we should …

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

TY Zhuo, A Zebaze, N Suppattarachai… - arXiv preprint arXiv …, 2024 - arxiv.org
The high cost of full-parameter fine-tuning (FFT) of Large Language Models (LLMs) has led
to a series of parameter-efficient fine-tuning (PEFT) methods. However, it remains unclear …

Assessing logical puzzle solving in large language models: Insights from a minesweeper case study

Y Li, H Wang, C Zhang - arXiv preprint arXiv:2311.07387, 2023 - arxiv.org
Large Language Models (LLMs) have shown remarkable proficiency in language
understanding and have been successfully applied to a variety of real-world tasks through …

Prompting a pretrained transformer can be a universal approximator

A Petrov, PHS Torr, A Bibi - arXiv preprint arXiv:2402.14753, 2024 - arxiv.org
Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of
transformer models, our theoretical understanding of these fine-tuning methods remains …

On the Role of Unstructured Training Data in Transformers' In-Context Learning Capabilities

KC Wibisono, Y Wang - NeurIPS 2023 Workshop on Mathematics of …, 2023 - openreview.net
Transformers have exhibited impressive in-context learning (ICL) capabilities: they can
generate predictions for new query inputs based on sequences of inputs and outputs (i.e. …

The economic institutions of artificial intelligence

S Davidson - Journal of Institutional Economics, 2024 - cambridge.org
This paper explores the role of artificial intelligence (AI) within economic institutions,
focusing on bounded rationality as understood by Herbert Simon. Artificial Intelligence can …