The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset

H Laurençon, L Saulnier, T Wang… - Advances in …, 2022 - proceedings.neurips.cc
As language models grow ever larger, the need for large-scale high-quality text datasets has
never been more pressing, especially in multilingual settings. The BigScience workshop, a 1 …

A novel data and model centric artificial intelligence based approach in developing high-performance named entity recognition for Bengali language

KA Lima, K Md Hasib, S Azam, A Karim, S Montaha… - Plos one, 2023 - journals.plos.org
Named Entity Recognition (NER) plays a significant role in enhancing the performance of all
types of domain specific applications in Natural Language Processing (NLP). According to …

Context-Aware auto-encoded graph neural model for dynamic question generation using NLP

S Dara, CH Srinivasulu, CHM Babu, A Ravuri… - ACM transactions on …, 2023 - dl.acm.org
Question generation is an important task in natural language processing that involves
generating questions from a given text. This paper proposes a novel approach for dynamic …

A survey on NLP resources, tools, and techniques for Marathi language processing

P Lahoti, N Mittal, G Singh - ACM Transactions on Asian and Low …, 2022 - dl.acm.org
Natural Language Processing (NLP) has been in practice for the past couple of decades,
and extensive work has been done for the Western languages, particularly the English …

Reading comprehension based question answering system in Bangla language with transformer-based learning

TT Aurpa, RK Rifat, MS Ahmed, MM Anwar, ABMS Ali - Heliyon, 2022 - cell.com
Question answering (QA) system in any language is an assortment of mechanisms for
obtaining answers to user questions with various data compositions. Reading …

A survey of deep learning techniques for machine reading comprehension

S Kazi, S Khoja, A Daud - Artificial Intelligence Review, 2023 - Springer
Reading comprehension involves the process of reading and understanding textual
information in order to answer questions related to it. It finds practical applications in various …

Transformer based answer-aware Bengali question generation

JF Ruma, TT Mayeesha, RM Rahman - International Journal of Cognitive …, 2023 - Elsevier
Question generation (QG), the task of generating questions from text or other forms of data, a
significant and challenging subject, has recently attracted more attention in natural language …

Slovak dataset for multilingual question answering

D Hládek, J Staš, J Juhár, T Koctúr - IEEE Access, 2023 - ieeexplore.ieee.org
SK-QuAD is the first manually annotated dataset of questions and answers in Slovak. It
consists of more than 91k factual questions and answers from various fields. Each question …

BanglaRQA: A benchmark dataset for under-resourced Bangla language reading comprehension-based question answering with diverse question-answer types

SMS Ekram, AA Rahman, MS Altaf… - Findings of the …, 2022 - aclanthology.org
High-resource languages, such as English, have access to a plethora of datasets with
various question-answer types resembling real-world reading comprehension. However …

AraConv: Developing an Arabic task-oriented dialogue system using multi-lingual transformer model mT5

A Fuad, M Al-Yahya - Applied Sciences, 2022 - mdpi.com
Task-oriented dialogue systems (DS) are designed to help users perform daily activities
using natural language. Task-oriented DS for English language have demonstrated …