In-context convergence of transformers

Y Huang, Y Cheng, Y Liang - arXiv preprint arXiv:2310.05249, 2023 - arxiv.org
Transformers have recently revolutionized many domains in modern machine learning, and one salient discovery is their remarkable in-context learning capability, where models can …

Competition-level problems are effective llm evaluators

Y Huang, Z Lin, X Liu, Y Gong, S Lu, F Lei… - Findings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet
there is ongoing debate about these abilities and the potential data contamination problem …

What Makes Multimodal In-Context Learning Work?

FB Baldassini, M Shukor, M Cord… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large Language Models have demonstrated remarkable performance across various tasks, exhibiting the capacity to swiftly acquire new skills, such as through In-Context …

Large language models for social networks: Applications, challenges, and solutions

J Zeng, R Huang, W Malik, L Yin, B Babic… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are transforming the way people generate, explore, and
engage with content. We study how we can develop LLM applications for online social …

Compositional generative modeling: A single model is not all you need

Y Du, L Kaelbling - arXiv preprint arXiv:2402.01103, 2024 - arxiv.org
Large monolithic generative models trained on massive amounts of data have become an
increasingly dominant approach in AI research. In this paper, we argue that we should …

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

TY Zhuo, A Zebaze, N Suppattarachai… - arXiv preprint arXiv …, 2024 - arxiv.org
The high cost of full-parameter fine-tuning (FFT) of Large Language Models (LLMs) has led
to a series of parameter-efficient fine-tuning (PEFT) methods. However, it remains unclear …

Assessing logical puzzle solving in large language models: Insights from a minesweeper case study

Y Li, H Wang, C Zhang - arXiv preprint arXiv:2311.07387, 2023 - arxiv.org
Large Language Models (LLMs) have shown remarkable proficiency in language
understanding and have been successfully applied to a variety of real-world tasks through …

Prompting a pretrained transformer can be a universal approximator

A Petrov, PHS Torr, A Bibi - arXiv preprint arXiv:2402.14753, 2024 - arxiv.org
Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of
transformer models, our theoretical understanding of these fine-tuning methods remains …

On the Role of Unstructured Training Data in Transformers' In-Context Learning Capabilities

KC Wibisono, Y Wang - NeurIPS 2023 Workshop on Mathematics of …, 2023 - openreview.net
Transformers have exhibited impressive in-context learning (ICL) capabilities: they can
generate predictions for new query inputs based on sequences of inputs and outputs (i.e. …

The economic institutions of artificial intelligence

S Davidson - Journal of Institutional Economics, 2024 - cambridge.org
This paper explores the role of artificial intelligence (AI) within economic institutions,
focusing on bounded rationality as understood by Herbert Simon. Artificial Intelligence can …