Z Li, X Ding, K Liao, B Qin, T Liu - arXiv preprint arXiv:2107.09852, 2021 - arxiv.org
Recent work has shown success in incorporating pre-trained models like BERT to improve
NLP systems. However, existing pre-trained models lack causal knowledge which …