BLOOM: A 176B-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science
Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

StarCoder: may the source be with you!

R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …

StarCoder 2 and The Stack v2: The next generation

A Lozhkov, R Li, LB Allal, F Cassano… - arXiv preprint arXiv …, 2024 - arxiv.org
The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

The ROOTS search tool: Data transparency for LLMs

A Piktus, C Akiki, P Villegas, H Laurençon… - arXiv preprint arXiv …, 2023 - arxiv.org
ROOTS is a 1.6 TB multilingual text corpus developed for the training of BLOOM, currently
the largest language model explicitly accompanied by commensurate data governance …

A literature survey on open source large language models

S Kukreja, T Kumar, A Purohit, A Dasgupta… - Proceedings of the 2024 …, 2024 - dl.acm.org
Since the 1950s, post the Turing test, humans have been striving hard to make machines
learn the art of mastering linguistic intelligence. Language being a complex and intricate tool …

Building open-source AI

YR Shrestha, G von Krogh, S Feuerriegel - Nature Computational …, 2023 - nature.com

The AI community building the future? A quantitative analysis of development activity on Hugging Face Hub

C Osborne, J Ding, HR Kirk - Journal of Computational Social Science, 2024 - Springer
Open model developers have emerged as key actors in the political economy of artificial
intelligence (AI), but we still have a limited understanding of collaborative practices in the …

Spacerini: Plug-and-play search engines with Pyserini and Hugging Face

C Akiki, O Ogundepo, A Piktus, X Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Spacerini, a modular framework for seamless building and deployment of
interactive search applications, designed to facilitate the qualitative analysis of large scale …

Stronger together: on the articulation of ethical charters, legal tools, and technical documentation in ML

G Pistilli, C Muñoz Ferrandis, Y Jernite… - Proceedings of the 2023 …, 2023 - dl.acm.org
The growing need for accountability of the people behind AI systems can be addressed by
leveraging processes in three fields of study: ethics, law, and computer science. While these …