Masked language modeling and the distributional hypothesis: Order word matters pre-training for little

K Sinha, R Jia, D Hupkes, J Pineau, A Williams… - arXiv preprint arXiv …, 2021 - arxiv.org
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …
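As context for the snippet above, here is a minimal illustration of the masked language modeling objective it refers to (a toy sketch, not the paper's code; the 15% masking rate and the [MASK] symbol follow BERT-style conventions):

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with a mask symbol and return
    (corrupted sequence, {position: original token}) as MLM training targets."""
    rng = random.Random(seed)
    corrupted, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok
            corrupted[i] = mask_token
    return corrupted, targets

# The model is trained to recover the original tokens at the masked positions.
print(mask_tokens("the parser attaches the clause to the verb".split()))
```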

Natural language processing advancements by deep learning: A survey

A Torfi, RA Shirvani, Y Keneshloo, N Tavaf… - arXiv preprint arXiv …, 2020 - arxiv.org
Natural Language Processing (NLP) helps empower intelligent machines by enabling a better understanding of human language for linguistics-based human-computer …

Semantic probabilistic layers for neuro-symbolic learning

K Ahmed, S Teso, KW Chang… - Advances in …, 2022 - proceedings.neurips.cc
We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network, guaranteeing that its predictions are consistent with a set of predefined …

How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN

RT McCoy, P Smolensky, T Linzen, J Gao… - Transactions of the …, 2023 - direct.mit.edu
Current language models can generate high-quality text. Are they simply copying text they
have seen before, or have they learned generalizable linguistic abstractions? To tease apart …

Automated concatenation of embeddings for structured prediction

X Wang, Y Jiang, N Bach, T Wang, Z Huang… - arXiv preprint arXiv …, 2020 - arxiv.org
Pretrained contextualized embeddings are powerful word representations for structured
prediction tasks. Recent work found that better word representations can be obtained by …

Disentangling syntax and semantics in the brain with deep networks

C Caucheteux, A Gramfort… - … conference on machine …, 2021 - proceedings.mlr.press
The activations of language transformers like GPT-2 have been shown to linearly map onto
brain activity during speech comprehension. However, the nature of these activations …

SynGEC: Syntax-enhanced grammatical error correction with a tailored GEC-oriented parser

Y Zhang, B Zhang, Z Li, Z Bao, C Li… - arXiv preprint arXiv …, 2022 - arxiv.org
This work proposes a syntax-enhanced grammatical error correction (GEC) approach
named SynGEC that effectively incorporates dependency syntactic information into the …

Fast and accurate neural CRF constituency parsing

Y Zhang, H Zhou, Z Li - arXiv preprint arXiv:2008.03736, 2020 - arxiv.org
Estimating probability distributions is one of the core issues in the NLP field. However, in both the deep learning (DL) and pre-DL eras, unlike the vast applications of linear-chain CRF in …
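For reference, the linear-chain CRF mentioned in this snippet normalizes sequence scores with the forward algorithm over emission and transition scores; below is a minimal log-space sketch of that computation (illustrative only, not the paper's tree-structured CRF parser):

```python
import numpy as np
from scipy.special import logsumexp

def crf_log_partition(emissions, transitions):
    """Log partition function log Z of a linear-chain CRF.
    emissions: (T, K) per-position label scores; transitions: (K, K) label-to-label scores."""
    alpha = emissions[0]                      # log-scores of all length-1 label prefixes
    for t in range(1, len(emissions)):
        # alpha_new[j] = logsumexp_i(alpha[i] + transitions[i, j]) + emissions[t, j]
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + emissions[t]
    return logsumexp(alpha)

# The probability of a label sequence y is exp(score(x, y) - log Z).
rng = np.random.default_rng(0)
print(crf_log_partition(rng.normal(size=(5, 3)), rng.normal(size=(3, 3))))
```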

Nested named entity recognition as latent lexicalized constituency parsing

C Lou, S Yang, K Tu - arXiv preprint arXiv:2203.04665, 2022 - arxiv.org
Nested named entity recognition (NER) has been receiving increasing attention.
Recently, Fu et al. (2021) adapted a span-based constituency parser to tackle nested NER …

Fusing heterogeneous factors with triaffine mechanism for nested named entity recognition

Z Yuan, C Tan, S Huang, F Huang - arXiv preprint arXiv:2110.07480, 2021 - arxiv.org
Nested entities are observed in many domains due to their compositionality and cannot
be easily recognized by the widely used sequence labeling framework. A natural solution is …
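Both nested-NER entries above move from per-token sequence labeling to scoring candidate spans; the following is a minimal sketch of the span-enumeration step that such span-based models (constituency-parser adaptations, triaffine scorers) build on, with the actual span classifier left out as it is model-specific:

```python
def enumerate_spans(tokens, max_len=8):
    """All candidate (start, end) spans up to max_len tokens; nested entities are
    simply overlapping spans, which a per-token BIO tagger cannot represent."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + max_len, n) + 1)]

tokens = "the New York University campus".split()
for start, end in enumerate_spans(tokens, max_len=3):
    print(tokens[start:end])  # e.g. both ['New', 'York'] and ['New', 'York', 'University'] appear
```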