Contemporary approaches in evolving language models

D Oralbekova, O Mamyrbayev, M Othman… - Applied Sciences, 2023 - mdpi.com
This article provides a comprehensive survey of contemporary language modeling
approaches within the realm of natural language processing (NLP) tasks. This paper …

Trans-tokenization and cross-lingual vocabulary transfers: Language adaptation of LLMs for low-resource NLP

F Remy, P Delobelle, H Avetisyan… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of monolingual language models for low and mid-resource languages
continues to be hindered by the difficulty in sourcing high-quality training data. In this study …

Classification of scientific documents in the Kazakh language using deep neural networks and a fusion of images and text

A Bogdanchikov, D Ayazbayev, I Varlamis - Big Data and Cognitive …, 2022 - mdpi.com
The rapid development of natural language processing and deep learning techniques has
boosted the performance of related algorithms in several linguistic and text mining tasks …

A benchmark for evaluating Arabic word embedding models

S Yagi, A Elnagar, S Fareh - Natural Language Engineering, 2023 - cambridge.org
Modelling the distributional semantics of such a morphologically rich language as Arabic
needs to take into account its introflexive, fusional, and inflectional nature attributes that …

Advancing Arabic Word Embeddings: A Multi-Corpora Approach with Optimized Hyperparameters and Custom Evaluation

A Allahim, A Cherif - Applied Sciences, 2024 - mdpi.com
The expanding Arabic user base presents a unique opportunity for researchers to tap into
vast online Arabic resources. However, the lack of reliable Arabic word embedding models …

Domain generalization using ensemble learning

Y Mesbah, YY Ibrahim, AM Khan - Intelligent Systems and Applications …, 2022 - Springer
Domain generalization is a sub-field of transfer learning that aims at bridging the
gap between two different domains in the absence of any knowledge about the target …

Deep learning models in software requirements engineering

M Naumcheva - arXiv preprint arXiv:2105.07771, 2021 - arxiv.org
Requirements elicitation is an important phase of any software project: the errors in
requirements are more expensive to fix than the errors introduced at later stages of software …

Cross-location activity recognition using adversarial learning

A Khattak, A Khan - Proceedings of the 11th International Symposium on …, 2022 - dl.acm.org
Human activity recognition (HAR) is an emerging field of study to recognize human
movement and actions from recorded data. It plays a significant role in human-computer …

Analyzing the effectiveness of image augmentations for face recognition from limited data

A Zhuchkov - 2021 International Conference "Nonlinearity …, 2021 - ieeexplore.ieee.org
This work presents an analysis of the effectiveness of image augmentations for the problem
of face recognition from limited data. We considered basic manipulations, generative …

TatarTTS: An Open-Source Text-to-Speech Synthesis Dataset for the Tatar Language

D Orel, A Kuzdeuov, R Gilmullin… - … in Information and …, 2024 - ieeexplore.ieee.org
This paper introduces an open-source dataset for speech synthesis in the Tatar language.
The dataset comprises approximately 70 hours of transcribed audio recordings, featuring …