NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

HRS-Bench: Holistic, reliable and scalable benchmark for text-to-image models

EM Bakr, P Sun, X Shen, FF Khan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Designing robust text-to-image (T2I) models has been extensively explored in recent years,
especially with the emergence of diffusion models, which achieve state-of-the-art results on …

Learning disentangled textual representations via statistical measures of similarity

P Colombo, G Staerman, N Noiry… - arXiv preprint arXiv …, 2022 - arxiv.org
When working with textual data, a natural application of disentangled representations is fair
classification where the goal is to make predictions without being biased (or influenced) by …

Can Pretrained Language Models (Yet) Reason Deductively?

Z Yuan, S Hu, I Vulić, A Korhonen, Z Meng - arXiv preprint arXiv …, 2022 - arxiv.org
Acquiring factual knowledge with Pretrained Language Models (PLMs) has attracted
increasing attention, showing promising performance in many knowledge-intensive tasks …

PreQuEL: Quality estimation of machine translation outputs in advance

S Don-Yehiya, L Choshen, O Abend - arXiv preprint arXiv:2205.09178, 2022 - arxiv.org
We present the task of PreQuEL, Pre-(Quality-Estimation) Learning. A PreQuEL system
predicts how well a given sentence will be translated, without recourse to the actual …

Multimodal robustness for neural machine translation

Y Zhao, I Calapodescu - Proceedings of the 2022 conference on …, 2022 - aclanthology.org
In this paper, we look at the case of a generic text-to-text NMT model that has to deal with
data coming from various modalities, like speech, images, or noisy text extracted from the …

Every picture tells a story: Image-grounded controllable stylistic story generation

H Lovenia, B Wilie, R Barraud, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
Generating a short story out of an image is arduous. Unlike image captioning, story
generation from an image poses multiple challenges: preserving the story coherence …

Multimodal conversation modelling for topic derailment detection

Z Li, M Rei, L Specia - Findings of the Association for …, 2022 - aclanthology.org
Conversations on social media tend to go off-topic and turn into different and sometimes
toxic exchanges. Previous work focuses on analysing textual dialogues that have derailed …

Can Question Rewriting Help Conversational Question Answering?

E Ishii, Y Xu, S Cahyawijaya, B Wilie - arXiv preprint arXiv:2204.06239, 2022 - arxiv.org
Question rewriting (QR) is a subtask of conversational question answering (CQA) aiming to
ease the challenges of understanding dependencies among dialogue history by …