EXAMS: A multi-subject high school examinations dataset for cross-lingual and multilingual...

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org

Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Cited by 2148 - Related articles - All 4 versions

Bactrian-X: Multilingual replicable instruction-following models with low-rank adaptation

H Li, F Koto, M Wu, AF Aji, T Baldwin - arXiv preprint arXiv:2305.15011, 2023 - arxiv.org

Instruction tuning has shown great promise in improving the performance of large language
models. However, research on multilingual instruction tuning has been limited due to the …

Cited by 54 - Related articles - All 3 versions

XOR QA: Cross-lingual open-retrieval question answering

A Asai, J Kasai, JH Clark, K Lee, E Choi… - arXiv preprint arXiv …, 2020 - arxiv.org

Multilingual question answering tasks typically assume answers exist in the same language
as the question. Yet in practice, many languages face both information scarcity--where …

Cited by 142 - Related articles - All 4 versions

The Belebele benchmark: a parallel reading comprehension dataset in 122 language variants

L Bandarkar, D Liang, B Muller, M Artetxe… - arXiv preprint arXiv …, 2023 - arxiv.org

We present Belebele, a multiple-choice machine reading comprehension (MRC) dataset
spanning 122 language variants. Significantly expanding the language coverage of natural …

Cited by 69 - Related articles - All 2 versions

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org

Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Cited by 55 - Related articles - All 2 versions

DuRecDial 2.0: A bilingual parallel corpus for conversational recommendation

Z Liu, H Wang, ZY Niu, H Wu, W Che - arXiv preprint arXiv:2109.08877, 2021 - arxiv.org

In this paper, we provide a bilingual parallel human-to-human recommendation dialog
dataset (DuRecDial 2.0) to enable researchers to explore a challenging task of multilingual …

Cited by 51 - Related articles - All 6 versions

Revisiting machine translation for cross-lingual classification

M Artetxe, V Goswami, S Bhosale, A Fan… - arXiv preprint arXiv …, 2023 - arxiv.org

Machine Translation (MT) has been widely used for cross-lingual classification, either by
translating the test set into English and running inference with a monolingual model …

Cited by 26 - Related articles - All 4 versions

Give me the facts! A survey on factual knowledge probing in pre-trained language models

P Youssef, OA Koraş, M Li, J Schlötterer… - arXiv preprint arXiv …, 2023 - arxiv.org

Pre-trained Language Models (PLMs) are trained on vast unlabeled data, rich in world
knowledge. This fact has sparked the interest of the community in quantifying the amount of …

Cited by 17 - Related articles - All 4 versions

AceGPT, localizing large language models in Arabic

H Huang, F Yu, J Zhu, X Sun, H Cheng, D Song… - arXiv preprint arXiv …, 2023 - arxiv.org

This paper is devoted to the development of a localized Large Language Model (LLM)
specifically for Arabic, a language imbued with unique cultural characteristics inadequately …

Cited by 12 - Related articles - All 6 versions

Leaf: Multiple-choice question generation

K Vachev, M Hardalov, G Karadzhov… - … on Information Retrieval, 2022 - Springer

Testing with quiz questions has proven to be an effective way to assess and improve the
educational process. However, manually creating quizzes is tedious and time-consuming …

Cited by 36 - Related articles - All 6 versions
