Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …

An empirical exploration of curriculum learning for neural machine translation

X Zhang, G Kumar, H Khayrallah, K Murray… - arXiv preprint arXiv …, 2018 - arxiv.org
Machine translation systems based on deep neural networks are expensive to train.
Curriculum learning aims to address this issue by choosing the order in which samples are …

Self-supervised curriculum learning for spelling error correction

Z Gan, H Xu, H Zan - Proceedings of the 2021 Conference on …, 2021 - aclanthology.org
Spelling Error Correction (SEC) that requires high-level language understanding is
a challenging but useful task. Current SEC approaches normally leverage a pre-training …

Dynamic data selection for curriculum learning via ability estimation

JP Lalor, H Yu - Proceedings of the Conference on Empirical …, 2020 - ncbi.nlm.nih.gov
Curriculum learning methods typically rely on heuristics to estimate the difficulty of training
examples or the ability of the model. In this work, we propose replacing difficulty heuristics …

Comparison of object detection methods for corn damage assessment using deep learning

A Hamidisepehr, SV Mirnezami… - Transactions of the …, 2020 - elibrary.asabe.org
Corn damage detection was possible using advanced deep learning and
computer vision techniques trained with images of simulated corn lodging. RetinaNet and …

Generic and trend-aware curriculum learning for relation extraction

N Vakil, H Amiri - Proceedings of the 2022 Conference of the …, 2022 - aclanthology.org
We present a generic and trend-aware curriculum learning approach that effectively
integrates textual and structural information in text graphs for relation extraction between …

Curriculum learning for graph neural networks: A multiview competence-based approach

N Vakil, H Amiri - arXiv preprint arXiv:2307.08859, 2023 - arxiv.org
A curriculum is a planned sequence of learning materials and an effective one can make
learning efficient and effective for both humans and machines. Recent studies developed …

Leitner-Guided Memory Replay for Cross-lingual Continual Learning

M M'hamdi, J May - Proceedings of the 2024 Conference of the …, 2024 - aclanthology.org
Cross-lingual continual learning aims to continuously fine-tune a downstream model on
emerging data from new languages. One major challenge in cross-lingual continual learning …

HuCurl: Human-induced curriculum discovery

M Elgaar, H Amiri - arXiv preprint arXiv:2307.07412, 2023 - arxiv.org
We introduce the problem of curriculum discovery and describe a curriculum learning
framework capable of discovering effective curricula in a curriculum space based on prior …

Learn the time to learn: Replay scheduling in continual learning

M Klasson, H Kjellström, C Zhang - arXiv preprint arXiv:2209.08660, 2022 - arxiv.org
Replay methods have been shown to be successful in mitigating catastrophic forgetting in
continual learning scenarios despite having limited access to historical data. However …