AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Survey of low-resource machine translation

B Haddow, R Bawden, AVM Barone, J Helcl… - Computational …, 2022 - direct.mit.edu
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …

Scaling neural machine translation to 200 languages

NLLB Team - Nature, 2024 - pmc.ncbi.nlm.nih.gov
The development of neural techniques has opened up new avenues for research in
machine translation. Today, neural machine translation (NMT) systems can leverage highly …

A voyage on neural machine translation for Indic languages

SK Sheshadri, D Gupta, MR Costa-Jussà - Procedia Computer Science, 2023 - Elsevier
With the invention of deep learning concepts, Machine Translation (MT) migrated towards
Neural Machine Translation (NMT) architectures, eventually from Statistical Machine …

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

J Choi, SJ Park, M Kim, YM Ro - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper proposes a novel direct Audio-Visual Speech to Audio-Visual Speech
Translation (AV2AV) framework where the input and output of the system are multimodal (i.e. …

Aksharantar: Towards building open transliteration tools for the next billion users

Y Madhani, S Parthan, P Bedekar, R Khapra… - arXiv preprint arXiv …, 2022 - academia.edu
We introduce Aksharantar, the largest publicly available transliteration dataset for 21 Indic
languages containing 26 million transliteration pairs. We build this dataset by mining …

English–Assamese neural machine translation using prior alignment and pre-trained language model

SR Laskar, B Paul, P Dadure, R Manna… - Computer Speech & …, 2023 - Elsevier
In a multilingual country like India, automatic natural language translation plays a key role in
building a community with different linguistic people. Many researchers have explored and …

Using natural language prompts for machine translation

X Garcia, O Firat - arXiv preprint arXiv:2202.11822, 2022 - arxiv.org
We explore the use of natural language prompts for controlling various aspects of the
outputs generated by machine translation models. We demonstrate that natural language …

Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages

K Bhogale, A Raman, T Javed… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Collecting labelled datasets for speech recognition systems for low-resource languages on
a diverse set of domains and speakers is expensive. In this work, we demonstrate an …

Aksharantar: Open Indic-language transliteration datasets and models for the next billion users

Y Madhani, S Parthan, P Bedekar, G Nc… - Findings of the …, 2023 - aclanthology.org
Transliteration is very important in the Indian language context due to the usage of multiple
scripts and the widespread use of romanized inputs. However, few training and evaluation …