Should you mask 15% in masked language modeling?

A Wettig, T Gao, Z Zhong, D Chen - arXiv preprint arXiv:2202.08005, 2022 - arxiv.org
Masked language models (MLMs) conventionally mask 15% of tokens due to the belief that
more masking would leave insufficient context to learn good representations; this masking …
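For context, a minimal sketch of the conventional BERT-style corruption this entry questions (the paper reports that substantially higher rates, e.g. around 40%, can match or beat 15%). The 80/10/10 replacement split follows Devlin et al. (2019), and the helper name `mask_tokens` is illustrative, not taken from the paper:

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=None):
    """BERT-style MLM corruption: sample `mask_rate` of positions as
    prediction targets; of those, 80% become [MASK], 10% become a
    random vocabulary token, and 10% keep the original token."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = {}  # position -> original token the model must predict
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            labels[i] = tok
            r = rng.random()
            if r < 0.8:
                corrupted[i] = MASK_TOKEN
            elif r < 0.9:
                corrupted[i] = rng.choice(vocab)
            # else: leave the token visible but still predict it
    return corrupted, labels

# The paper's question amounts to varying mask_rate, e.g. 0.4 instead of 0.15:
vocab = ["the", "cat", "sat", "on", "mat"]
print(mask_tokens("the cat sat on the mat".split(), vocab, mask_rate=0.4, seed=0))
```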

Emerging property of masked token for effective pre-training

H Choi, H Lee, S Joung, H Park, J Kim… - European Conference on …, 2024 - Springer
Driven by the success of Masked Language Modeling (MLM), the realm of self-supervised
learning for computer vision has been invigorated by the central role of Masked Image …

Parameter-efficient fine-tuning without introducing new latency

B Liao, Y Meng, C Monz - arXiv preprint arXiv:2305.16742, 2023 - arxiv.org
Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently
demonstrated remarkable achievements, effectively matching the performance of full fine …

Make pre-trained model reversible: From parameter to memory efficient fine-tuning

B Liao, S Tan, C Monz - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Parameter-efficient fine-tuning (PEFT) of pre-trained language models (PLMs) has emerged
as a highly successful approach, with training only a small number of parameters without …

ApiQ: Finetuning of 2-bit quantized large language model

B Liao, C Herold, S Khadivi, C Monz - arXiv preprint arXiv:2402.05147, 2024 - arxiv.org
Memory-efficient finetuning of large language models (LLMs) has recently attracted huge
attention with the increasing size of LLMs, primarily due to the constraints posed by GPU …

Representation deficiency in masked language modeling

Y Meng, J Krishnan, S Wang, Q Wang, Y Mao… - arXiv preprint arXiv …, 2023 - arxiv.org
Masked Language Modeling (MLM) has been one of the most prominent approaches for
pretraining bidirectional text encoders due to its simplicity and effectiveness. One notable …

ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models

K Zheng, J Yang, S Liang, B Feng, Z Liu, W Ju… - arXiv preprint arXiv …, 2025 - arxiv.org
Masked Language Models (MLMs) have achieved remarkable success in many self-
supervised representation learning tasks. MLMs are trained by randomly replacing some …

Domain Adaptation of Named Entity Recognition for Plant Health Monitoring

M Borovikova - 2024 - theses.hal.science
The increasing complexity of agricultural ecosystems and the urgent need for effective plant
health monitoring necessitate advanced technological solutions for processing textual data …