IndoLEM and IndoBERT: A benchmark dataset and pre-trained language model for Indonesian NLP

F Koto, A Rahimi, JH Lau, T Baldwin - arXiv preprint arXiv:2011.00677, 2020 - arxiv.org
Although the Indonesian language is spoken by almost 200 million people and the 10th
most spoken language in the world, it is under-represented in NLP research. Previous work …

One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Named-entity recognition for Indonesian language using bidirectional LSTM-CNNs

W Gunawan, D Suhartono, F Purnomo… - Procedia Computer …, 2018 - Elsevier
In this paper, we describe the implementation of Named-Entity Recognition (NER) for
Indonesian Language by using various deep learning approaches, yet mainly focused on …

Indonesian named-entity recognition for 15 classes using ensemble supervised learning

AS Wibawa, A Purwarianti - Procedia Computer Science, 2016 - Elsevier
Here, we describe our effort in building Indonesian Named Entity Recognition (NER) for
newspaper article with 15 classes which is larger number of class type compared to existing …

A literature review of question answering system using named entity recognition

R Wongso, D Suhartono - 2016 3rd international conference …, 2016 - ieeexplore.ieee.org
Named Entity Recognition (NER) is well-known as the core component of question
answering system. NER has traditionally been developed as a component for information …

Investigating Bi-LSTM and CRF with POS tag embedding for Indonesian named entity tagger

D Hoesen, A Purwarianti - 2018 International Conference on …, 2018 - ieeexplore.ieee.org
Researches on Indonesian named entity (NE) tagger have been conducted since years ago
but without using deep learning. Most researches employed traditional machine learning …

Rule-based crime information extraction on Indonesian digital news

F Rahma, A Romadhony - … Conference on Data Science and Its …, 2021 - ieeexplore.ieee.org
Crime Information Extraction is a task to extract some entities in the crime domain. Previous
researchers have studied this task using rules to extract some crime entities in the English …

A rule-based named-entity recognition for Malay articles

R Alfred, LC Leong, CK On, P Anthony, TS Fun… - Advanced Data Mining …, 2013 - Springer
A Named-Entity Recognition (NER) is part of the process in Text Mining used for
information extraction. This NER tool can be used to assist user in identifying and detecting …

Towards a standardized dataset on Indonesian named entity recognition

SO Khairunnisa, A Imankulova… - Proceedings of the 1st …, 2020 - aclanthology.org
In recent years, named entity recognition (NER) tasks in the Indonesian language have
undergone extensive development. There are only a few corpora for Indonesian NER; …

A semi-supervised algorithm for Indonesian named entity recognition

RA Leonandya, B Distiawan… - 2015 3rd international …, 2015 - ieeexplore.ieee.org
Named Entity Recognition or NER is one of the sub-research field of Information Extraction
which can be used for machine translation, question answering, semantic web, etc. One of …