On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Modern language models refute Chomsky's approach to language

S Piantadosi - Lingbuzz Preprint, lingbuzz, 2023 - lingbuzz.net
The rise and success of large language models undermines virtually every strong claim for
the innateness of language that has been proposed by generative linguistics. Modern …

Masked language modeling and the distributional hypothesis: Order word matters pre-training for little

K Sinha, R Jia, D Hupkes, J Pineau, A Williams… - arXiv preprint arXiv …, 2021 - arxiv.org
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …

What artificial neural networks can tell us about human language acquisition

A Warstadt, SR Bowman - Algebraic structures in natural …, 2022 - taylorfrancis.com
Rapid progress in machine learning for natural language processing has the potential to
transform debates about how humans learn language. However, the learning environments …

A discriminative account of the learning, representation and processing of inflection systems

M Ramscar - Language, Cognition and Neuroscience, 2023 - Taylor & Francis
What kind of knowledge accounts for linguistic productivity? How is it acquired? For years,
debate on these questions has focused on a seemingly obscure domain: inflectional …

Word order does matter and shuffled language models know it

M Abdou, V Ravishankar, A Kulmizev… - Proceedings of the 60th …, 2022 - aclanthology.org
Recent studies have shown that language models pretrained and/or fine-tuned on randomly
permuted sentences exhibit competitive performance on GLUE, putting into question the …

Probing for the usage of grammatical number

K Lasri, T Pimentel, A Lenci, T Poibeau… - arXiv preprint arXiv …, 2022 - arxiv.org
A central quest of probing is to uncover how pre-trained models encode a linguistic property
within their representations. An encoding, however, might be spurious, i.e., the model might …

SemAttack: Natural textual attacks via different semantic spaces

B Wang, C Xu, X Liu, Y Cheng, B Li - arXiv preprint arXiv:2205.01287, 2022 - arxiv.org
Recent studies show that pre-trained language models (LMs) are vulnerable to textual
adversarial attacks. However, existing attack methods either suffer from low attack success …

Pretraining with artificial language: Studying transferable knowledge in language models

R Ri, Y Tsuruoka - arXiv preprint arXiv:2203.10326, 2022 - arxiv.org
We investigate what kind of structural knowledge learned in neural network encoders is
transferable to processing natural language. We design artificial languages with structural …

When classifying grammatical role, BERT doesn't care about word order... except when it matters

I Papadimitriou, R Futrell, K Mahowald - arXiv preprint arXiv:2203.06204, 2022 - arxiv.org
Because meaning can often be inferred from lexical semantics alone, word order is often a
redundant cue in natural language. For example, the words chopped, chef, and onion are …