R Tang, Y Lu, J Lin - EMNLP-IJCNLP 2019 - aclanthology.org
Abstract: Knowledge distillation can effectively transfer knowledge from BERT, a deep language representation model, to traditional, shallow word embedding-based neural …