[BOOK] AI in Education: Effective Machine Learning Methods To Improve Data Scarcity and Knowledge Generalization

JT Shen - 2023 - search.proquest.com
In education, machine learning (ML), especially deep learning (DL) in recent years, has
been extensively used to improve both teaching and learning. Despite the rapid …

Learning to augment for data-scarce domain BERT knowledge distillation

L Feng, M Qiu, Y Li, HT Zheng, Y Shen - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Although pre-trained language models such as BERT have achieved appealing performance
in a wide range of Natural Language Processing (NLP) tasks, they are computationally …

MathBERT: A pre-trained language model for general NLP tasks in mathematics education

JT Shen, M Yamashita, E Prihar, N Heffernan… - arXiv preprint arXiv …, 2021 - arxiv.org
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed
various customized BERT models with improved performance for specific domains and tasks …

Which student is best? A comprehensive knowledge distillation exam for task-specific BERT models

MN Nityasya, HA Wibowo, R Chevi, RE Prasojo… - arXiv preprint arXiv …, 2022 - arxiv.org
We perform a knowledge distillation (KD) benchmark from task-specific BERT-base teacher
models to various student models: BiLSTM, CNN, BERT-Tiny, BERT-Mini, and BERT-Small …
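
For context on the teacher-student setup this entry benchmarks, below is a minimal sketch of response-based (Hinton-style) knowledge distillation from a fine-tuned BERT-base teacher to a smaller student classifier; the distillation_loss helper and the temperature and mixing-weight values are illustrative assumptions, not details taken from the paper.

    # Minimal KD sketch (hyperparameters are assumed, not from the cited paper).
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        """Blend soft-target KL loss (scaled by T^2) with hard-label cross-entropy."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    # Typical training step: freeze the teacher, update only the student.
    # with torch.no_grad():
    #     teacher_logits = teacher(**batch).logits
    # student_logits = student(**batch).logits
    # loss = distillation_loss(student_logits, teacher_logits, batch["labels"])
    # loss.backward(); optimizer.step()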

[HTML] Automated Data Engineering for Deep Learning

M Zhao - 2023 - era.library.ualberta.ca
Due to the advancement of computational hardware and the abundance and variety of data,
deep learning has achieved significant success in natural language processing, computer …

3DG: A framework for using generative AI for handling sparse learner performance data from intelligent tutoring systems

L Zhang, J Lin, C Borchers, M Cao, X Hu - arXiv preprint arXiv:2402.01746, 2024 - arxiv.org
Learning performance data (e.g., quiz scores and attempts) are important for understanding
learner engagement and knowledge mastery levels. However, the learning performance data …

Heterogeneous Student Knowledge Distillation From BERT Using a Lightweight Ensemble Framework

CS Lin, CN Tsai, JS Jwo, CH Lee, X Wang - IEEE Access, 2024 - ieeexplore.ieee.org
Deep learning models have demonstrated their effectiveness in capturing complex
relationships between input features and target outputs across many different application …

Classifying math knowledge components via task-adaptive pre-trained BERT

JT Shen, M Yamashita, E Prihar, N Heffernan… - Artificial Intelligence in …, 2021 - Springer
Educational content labeled with proper knowledge components (KCs) is particularly
useful to teachers or content organizers. However, manually labeling educational content is …

KI-BERT: Infusing knowledge context for better language and domain understanding

K Faldu, A Sheth, P Kikani, H Akbari - arXiv preprint arXiv:2104.08145, 2021 - arxiv.org
Contextualized entity representations learned by state-of-the-art transformer-based
language models (TLMs) like BERT, GPT, T5, etc., leverage the attention mechanism to …

SKDBERT: Compressing BERT via stochastic knowledge distillation

Z Ding, G Jiang, S Zhang, L Guo, W Lin - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In this paper, we propose Stochastic Knowledge Distillation (SKD) to obtain a compact BERT-
style language model dubbed SKDBERT. In each distillation iteration, SKD samples a …