Learning class-specific word embeddings

S Kuang, BD Davison - The Journal of Supercomputing, 2020 - Springer
Recent years have seen the success of applying word embedding algorithms to natural
language processing (NLP) tasks. Most word embedding algorithms only produce a single …

A comparative text classification study with deep learning-based algorithms

Ö Köksal, Ö Akgül - 2022 9th International Conference on …, 2022 - ieeexplore.ieee.org
As a well-known Natural Language Processing (NLP) task, text classification can be defined
as the process of categorizing documents depending on their content. In this process …

Dict2vec: Learning word embeddings using lexical dictionaries

J Tissier, C Gravier, A Habrard - Conference on Empirical Methods …, 2017 - ujm.hal.science
Learning word embeddings on large unla-beled corpus has been shown to be successful in
improving many natural language tasks. The most efficient and popular approaches learn or …

[HTML][HTML] A robust morpheme sequence and convolutional neural network-based Uyghur and Kazakh short text classification

S Parhat, M Ablimit, A Hamdulla - Information, 2019 - mdpi.com
In this paper, based on the multilingual morphological analyzer, we researched the similar
low-resource languages, Uyghur and Kazakh, short text classification. Generally, the online …

New techniques for Arabic document classification

HKH Chantar - 2013 - ros.hw.ac.uk
Text classification (TC) concerns automatically assigning a class (category) label to a text
document, and has increasingly many applications, particularly in the domain of organizing …

Exploring Character Trigrams for Robust Arabic Text Classification: A Comparative Analysis in the Face of Vocabulary Expansion and Misspelled Words

D Alomari, I Ahmad - IEEE Access, 2024 - ieeexplore.ieee.org
Tokenization is an important early step in natural language processing (NLP) tasks. The
idea is to split the input sentence into smaller units, called tokens, for further processing …

The impact of deep learning on document classification using semantically rich representations

Z Kastrati, AS Imran, SY Yayilgan - Information Processing & Management, 2019 - Elsevier
This paper presents a semantically rich document representation model for automatically
classifying financial documents into predefined categories utilizing deep learning. The …

Using word embeddings for Italian crime news categorization

G Bonisoli, F Rollo, L Po - 2021 16th Conference on Computer …, 2021 - ieeexplore.ieee.org
Several studies have shown that the use of embeddings improves outcomes in many
Natural Language Processing (NLP) activities, including text categorization. This paper …

Combining Bag-of-Words and Bag-of-Concepts representations for Arabic text classification

This paper introduces a set of new approaches for text representation for automatic
classification of Arabic textual documents. These approaches are based on combining the …

On the importance of tokenization in Arabic embedding models

M Alkaoud, M Syed - Proceedings of the fifth Arabic natural …, 2020 - aclanthology.org
Arabic, like other highly inflected languages, encodes a large amount of information in its
morphology and word structure. In this work, we propose two embedding strategies that …