Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining

K Nowakowski, M Ptaszynski, K Murasaki… - Information Processing …, 2023 - Elsevier
In recent years, neural models learned through self-supervised pretraining on large scale
multilingual text or speech data have exhibited promising results for underresourced …

Machine Translation for Highly Low-Resource Language: A Case Study of Ainu, a Critically Endangered Indigenous Language in Northern Japan

S Miyagawa - Proceedings of the Joint 3rd International …, 2023 - aclanthology.org
This paper explores the potential of Machine Translation (MT) in preserving and revitalizing
Ainu, an indigenous language of Japan classified as critically endangered by UNESCO …

Ainu–Japanese Bi-directional Neural Machine Translation: A Step Towards Linguistic Preservation of Ainu, An Under-Resourced Indigenous Language in Japan

S Miyagawa - Journal of Data Mining & Digital Humanities, 2024 - jdmdh.episciences.org
This study presents a groundbreaking approach to preserving the Ainu language,
recognized as critically endangered by UNESCO, by developing a bi-directional neural …

Development of a digital corpus and core language technologies for the Ainu language

N KAROL PIOTR - 2020 - kitami-it.repo.nii.ac.jp
Technologies being developed within the field of Natural Language Processing (NLP) have
an important role to play in the urgent tasks of documenting, analyzing and revitalizing …

Towards better text processing tools for the Ainu language

K Nowakowski, M Ptaszynski, F Masui - Language and Technology …, 2017 - Springer
In this paper we present our research devoted to the development of Natural Language
Processing technologies for the Ainu language, a critically endangered language isolate …

An Empirical Study on Efficiency of a Dictionary Based Viterbi Algorithm for Word Segmentation

S Aggarwal, S Houshmand… - … Conference on Big …, 2020 - ieeexplore.ieee.org
In this paper we present an algorithm for segmenting English sentences, without spaces,
into their constituent words based on a dictionary using a variation of the Viterbi algorithm …

PRACTICAL APPLICATION OF SELECTED DATABASE SYSTEMS IN NATURAL LANGUAGE PROCESSING

M Skublewska-Paszkowska, T Zientarski - INTED2020 Proceedings, 2020 - library.iated.org
According to the 2017 curriculum, the database is one of the key components of Information
Technology knowledge area. First- and second-level students are taught the theory as well …

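The Ainu translation of 『五倫名義解』: Taking on a Universal Dependencies parallel corpus

Koichi Yasuoka, Motoko Yasuoka - Computer Applications in Oriental Studies, 36th …, 2023 - repository.kulib.kyoto-u.ac.jp
The Ainu translation of 『五倫名義解』 (catalog number K3-21), held by the Kaga Family Archives (Betsukai Town),
was produced by Kaga Denzo on the basis of 空谷茂潤's 『五倫名義解』 [2], and is said to have been written during the Bunkyu to Keio eras …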