The struggle of social media platforms to moderate content in a timely manner encourages users to abuse such platforms to spread vulgar or abusive language, which, when …
We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity Recognition (MULTICONER). Divided into 13 tracks, the task focused on methods to identify …
General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language …
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we …
As the capabilities of language models continue to advance, it is conceivable that a “one-size-fits-all” model will remain as the main paradigm. For instance, given the vast number of …
R Joshi - arXiv preprint arXiv:2211.11418, 2022 - arxiv.org
The monolingual Hindi BERT models currently available on the model hub do not perform better than the multi-lingual models on downstream tasks. We present L3Cube-HindBERT, a …
Natural language generation (NLG) benchmarks provide an important avenue to measure progress and develop better NLG systems. Unfortunately, the lack of publicly available NLG …
We train several language models for Icelandic, including IceBERT, that achieve state-of-the-art performance in a variety of downstream tasks, including part-of-speech tagging, named …
This paper presents a computational approach for creating a dataset on communal violence in the context of Bangladesh and West Bengal, India, along with a benchmark evaluation. In recent …