Statistical Analysis of Imbalanced Classification with Training Size Variation and Subsampling on Datasets of Research Papers in Biomedical Literature

J Dixon, M Rahman - Machine Learning and Knowledge Extraction, 2023 - mdpi.com
The overall purpose of this paper is to demonstrate how data preprocessing, training size
variation, and subsampling can dynamically change the performance metrics of imbalanced …

Improving imbalanced scientific text classification using sampling strategies and dictionaries

L Borrajo, R Romero, EL Iglesias… - Journal of integrative …, 2011 - degruyter.com
Many real applications have the imbalanced class distribution problem, where one of the
classes is represented by a very small number of cases compared to the other classes. One …

Building biomedical text classifiers under sample selection bias

R Romero, EL Iglesias, L Borrajo - International Symposium on Distributed …, 2011 - Springer
Scientific papers are a primary source of information for investigators to know the current
status in a topic or compare their results with other colleagues. However, mining biomedical …

Comparative Analysis using Various Performance Metrics in Imbalanced Data for Multi-class Text Classification

S Riyanto, SS Imas, T Djatna… - International Journal of …, 2023 - pdfs.semanticscholar.org
Precision, Recall, and F1-score are metrics that are often used to evaluate model
performance. Precision and Recall are very important to consider when the data is …

Reducing the effect of imbalance in text classification using SVD and GloVe with ensemble and deep learning

T Hossain, HZ Mauni, R Rab - Computing and Informatics, 2022 - cai.sk
Due to the recent escalation in the amount of text data available and used online, text
classification has become a staple for data analysts when extracting relevant information …

Exploratory Study of Data Sampling Methods for Imbalanced Legal Text Classification

DL Freire, AMG de Almeida, M de S. Dias… - … Conference on Hybrid …, 2023 - Springer
This article investigates the application of machine learning algorithms in the legal domain,
focusing on text classification tasks and addressing the challenges posed by imbalanced …

Analyzing the impact of resampling method for imbalanced data text in indonesian scientific articles categorization

A Indrawati, H Subagyo, A Sihombing… - Baca J. Dokumentasi …, 2020 - academia.edu
The extremely skewed data in artificial intelligence, machine learning, and data mining
cases are often given misleading results. It is caused because machine learning algorithms …

Comparative Multinomial Text Classification Analysis of Naïve Bayes and XGBoost with SMOTE on Imbalanced Dataset

A Chaturvedi, S Yadav, MAMH Ansari… - … Intelligence in Pattern …, 2022 - Springer
In supervised machine learning, with an imbalanced dataset, achieving better classification
in minority classes is a major challenge. In such situation, machine learning model shows …

Comparison of feature selection for imbalance text datasets

A Chandra - 2019 International Conference on Information …, 2019 - ieeexplore.ieee.org
The numbers of documents are increasing rapidly in a web format. Therefore, automatic
document classification is needed to help human to classify the documents. Text …

Emerging Trends in Classification with Imbalanced Datasets: A Bibliometric Analysis of Progression

A Maraş, Ç Erol - Bilişim Teknolojileri Dergisi, 2022 - dergipark.org.tr
Imbalanced or unbalanced datasets are defined as the highly skewed distribution of target
variable in the field of machine learning. Imbalanced datasets have greatly caught the …