One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

CILI: the collaborative interlingual index

F Bond, P Vossen, JP McCrae… - Proceedings of the 8th …, 2016 - aclanthology.org
This paper introduces the motivation for and design of the Collaborative InterLingual Index
(CILI). It is designed to make possible coordination between multiple loosely coupled …

Sentiment analysis for low resource languages: A study on informal Indonesian tweets

TA Le, D Moeljadi, Y Miura… - Proceedings of the 12th …, 2016 - aclanthology.org
This paper describes our attempt to build a sentiment analysis system for Indonesian tweets.
With this system, we can study and identify sentiments and opinions in a text or document …

Automatic Indonesian sentiment lexicon curation with sentiment valence tuning for social media sentiment analysis

R Wijayanti, A Arisal - ACM Transactions on Asian and Low-Resource …, 2021 - dl.acm.org
A novel Indonesian sentiment lexicon (SentIL--Sentiment Indonesian Lexicon) is created
with an automatic pipeline; from creating sentiment seed words, adding new words with …

Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning

I Yelmen, A Gunes, M Zontul - Applied Sciences, 2023 - mdpi.com
With the recent growth of the Internet, the volume of data has also increased. In particular,
the increase in the amount of unstructured data makes it difficult to manage data …

Kaji terap kecerdasan buatan di Badan Pengkajian dan Penerapan Teknologi

H Riza, AS Nugroho - Jurnal Sistem Cerdas, 2020 - apic.id
BPPT has carried out research and development in the field of artificial intelligence since
1987, namely through its involvement in the project for a machine translation system, multi …

New instances classification framework on Quran ontology applied to question answering system

FS Utomo, N Suryana, MS Azmi - … Computing Electronics and …, 2019 - telkomnika.uad.ac.id
Instances classification with the small dataset for Quran ontology is the current research
problem which appears in Quran ontology development. The existing classification …

Abui Wordnet: Using a Toolbox Dictionary to develop a wordnet for a low-resource language

F Kratochvíl, LM Da Costa - Proceedings of the first workshop on …, 2022 - aclanthology.org
This paper describes a procedure to link a Toolbox dictionary of a low-resource language to
correct synsets, generating a new wordnet. We introduce a bootstrapping technique utilising …

Building an HPSG-based Indonesian resource grammar (INDRA)

D Moeljadi, F Bond, S Song - 2015 - dr.ntu.edu.sg
This paper presents the creation and the initial stage development of a broad-coverage
Indonesian Resource Grammar (INDRA) within the framework of Head Driven Phrase …

Improved transcription and speaker identification system for concurrent speech in Bahasa Indonesia using recurrent neural network

MB Andra, T Usagawa - IEEE Access, 2021 - ieeexplore.ieee.org
Bahasa Indonesia is one of the most prominent low-resource languages that still lack
development in regards to communication-assisting technology. This paper proposes an …