- Academic resource search

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org

We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

Cited by: 917 (all 7 versions)

[PDF] arxiv.org

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

Cited by: 46 (all 3 versions)

[PDF] arxiv.org

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org

Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Cited by: 23 (all 2 versions)

[PDF] arxiv.org

Aya dataset: An open-access collection for multilingual instruction tuning

S Singh, F Vargus, D Dsouza, BF Karlsson… - arXiv preprint arXiv …, 2024 - arxiv.org

Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …

Cited by: 31 (all 2 versions)

[PDF] arxiv.org

GlobalBench: A benchmark for global progress in natural language processing

Y Song, C Cui, S Khanuja, P Liu, F Faisal… - arXiv preprint arXiv …, 2023 - arxiv.org

Despite the major advances in NLP, significant disparities in NLP system performance
across languages still exist. Arguably, these are due to uneven resource allocation and sub …

Cited by: 8 (all 4 versions)

[PDF] arxiv.org

Findings of the 2023 ML-SUPERB challenge: Pre-training and evaluation over more languages and beyond

J Shi, W Chen, D Berrebbi, HH Wang… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge
expands upon the acclaimed SUPERB framework, emphasizing self-supervised models in …

Cited by: 4 (all 5 versions)

Leveraging the Multilingual Indonesian Ethnic Languages Dataset In Self-Supervised Models for Low-Resource ASR Task

S Sakti, BA Titalim - 2023 IEEE Automatic Speech Recognition …, 2023 - ieeexplore.ieee.org

Indonesia is home to roughly 700 languages, which amounts to about ten percent of the
global total, positioning it as the second-most linguistically diverse country after Papua New …

Cited by: 4

[PDF] arxiv.org

Cross-lingual cross-age group adaptation for low-resource elderly speech emotion recognition

S Cahyawijaya, H Lovenia, W Chung, R Frieske… - arXiv preprint arXiv …, 2023 - arxiv.org

Speech emotion recognition plays a crucial role in human-computer interactions. However,
most speech emotion recognition research is biased toward English-speaking adults, which …

Cited by: 3 (all 4 versions)

Extraction and attribution of public figures statements for journalism in Indonesia using deep learning

YSP WP, YJ Kumar, NZ Zulkarnain, B Raza - Knowledge-Based Systems, 2024 - Elsevier

News articles are usually written by journalists based on statements taken from interviews
with public figures. Attribution from such statements provides important information and it …

Cited by: 1

[PDF] unair.ac.id

Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review

Z Abidin, A Junaidi - Journal of Information Systems …, 2024 - e-journal.unair.ac.id

Background: Stemming is significantly essential in natural language processing (NLP) due
to the ability to minimize word variations to fundamental forms. This procedure facilitates the …
