Dissociating language and thought in large language models

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

Recent advances in natural language processing via large pre-trained language models: A survey

B Min, H Ross, E Sulem, APB Veyseh… - ACM Computing …, 2023 - dl.acm.org
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

Winoground: Probing vision and language models for visio-linguistic compositionality

T Thrush, R Jiang, M Bartolo, A Singh… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present a novel task and dataset for evaluating the ability of vision and language models
to conduct visio-linguistic compositional reasoning, which we call Winoground. Given two …
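
The snippet cuts off before the scoring setup, so the following is a minimal sketch of Winoground-style pairwise matching as it is usually described: two captions and two images that contain the same words in a different order, with a model judged on pairing them correctly. The score(caption, image) function is a stand-in for any image-text matching model; this is illustrative, not code from the paper.

    # Minimal sketch of Winoground-style pairwise scoring (illustrative).
    # `score(caption, image)` should return a higher value for a better match.
    def winoground_scores(score, c0, c1, i0, i1):
        # Text score: each image must prefer its own caption.
        text_ok = score(c0, i0) > score(c1, i0) and score(c1, i1) > score(c0, i1)
        # Image score: each caption must prefer its own image.
        image_ok = score(c0, i0) > score(c0, i1) and score(c1, i1) > score(c1, i0)
        # Group score: both conditions must hold at once.
        return text_ok, image_ok, text_ok and image_ok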

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
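
As one concrete example of the techniques such surveys cover, here is a minimal sketch of a simple token-level augmentation (random swap, in the spirit of easy-data-augmentation methods); the function name and example sentence are illustrative and not taken from the paper.

    import random

    # Illustrative token-level augmentation: randomly swap word positions.
    def random_swap(sentence, n_swaps=1, seed=None):
        rng = random.Random(seed)
        tokens = sentence.split()
        for _ in range(n_swaps):
            if len(tokens) < 2:
                break
            i, j = rng.sample(range(len(tokens)), 2)
            tokens[i], tokens[j] = tokens[j], tokens[i]
        return " ".join(tokens)

    print(random_swap("data augmentation helps low resource tasks", n_swaps=2, seed=0))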

Modern language models refute Chomsky's approach to language

S Piantadosi - Lingbuzz Preprint, lingbuzz, 2023 - lingbuzz.net
The rise and success of large language models undermines virtually every strong claim for
the innateness of language that has been proposed by generative linguistics. Modern …

What does BERT look at? An analysis of BERT's attention

K Clark, U Khandelwal, O Levy, CD Manning - arXiv preprint arXiv …, 2019 - arxiv.org
Large pre-trained neural networks such as BERT have had great recent success in NLP,
motivating a growing body of research investigating what aspects of language they are able …
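
A minimal sketch of the kind of inspection this line of work performs, assuming the Hugging Face transformers and torch libraries: pull per-head attention maps out of a pretrained BERT and check which token each position attends to most. This is illustrative, not the authors' analysis code, and the layer/head indices are arbitrary.

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

    inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # outputs.attentions: one (batch, num_heads, seq_len, seq_len) tensor per layer.
    layer, head = 2, 3  # arbitrary head chosen for illustration
    attn = outputs.attentions[layer][0, head]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    for tok, row in zip(tokens, attn):
        print(tok, "->", tokens[int(row.argmax())])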

Masked language modeling and the distributional hypothesis: Order word matters pre-training for little

K Sinha, R Jia, D Hupkes, J Pineau, A Williams… - arXiv preprint arXiv …, 2021 - arxiv.org
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …
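
A minimal sketch of one way to probe this question, assuming the Hugging Face transformers fill-mask pipeline: compare a masked language model's top predictions for a sentence and for a word-order-scrambled variant. Illustrative only, not the authors' experimental setup.

    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")

    natural   = "The chef cooked a delicious [MASK] for dinner."
    scrambled = "delicious The a cooked chef [MASK] for dinner."

    for text in (natural, scrambled):
        top = fill(text, top_k=3)
        print(text)
        print([(p["token_str"], round(p["score"], 3)) for p in top])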

Analyzing multi-head self-attention: Specialized heads do the heavy lifting, the rest can be pruned

E Voita, D Talbot, F Moiseev, R Sennrich… - arXiv preprint arXiv …, 2019 - arxiv.org
Multi-head self-attention is a key component of the Transformer, a state-of-the-art
architecture for neural machine translation. In this work we evaluate the contribution made …
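
A minimal sketch of one simple head-importance statistic discussed in this line of work, head "confidence" (the mean of each head's maximum attention weight), computed here from Hugging Face attention outputs; the paper's actual pruning procedure is more involved, so this is illustrative only.

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

    inputs = tokenizer("Specialized heads do the heavy lifting.", return_tensors="pt")
    with torch.no_grad():
        attentions = model(**inputs).attentions  # one (1, heads, seq, seq) tensor per layer

    confidence = {}
    for layer, attn in enumerate(attentions):
        # Max over attended-to positions, then mean over query positions.
        conf_per_head = attn[0].max(dim=-1).values.mean(dim=-1)
        for head, c in enumerate(conf_per_head):
            confidence[(layer, head)] = float(c)

    for (layer, head), c in sorted(confidence.items(), key=lambda kv: -kv[1])[:5]:
        print(f"layer {layer} head {head}: confidence {c:.2f}")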

A structural probe for finding syntax in word representations

J Hewitt, CD Manning - Proceedings of the 2019 Conference of …, 2019 - aclanthology.org
Recent work has improved our ability to detect linguistic knowledge in word representations.
However, current methods for detecting syntactic knowledge do not test whether syntax trees …
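
The structural probe learns a linear map B such that squared L2 distances between transformed word vectors approximate distances between the corresponding words in the parse tree. Below is a minimal PyTorch sketch of that objective, with random tensors standing in for BERT representations and gold tree distances; it is not the released implementation.

    import torch

    torch.manual_seed(0)
    hidden_dim, probe_rank, seq_len = 768, 64, 6

    # Linear probe parameters B, trained so that ||B(h_i - h_j)||^2 ~ tree distance.
    B = torch.nn.Parameter(torch.randn(probe_rank, hidden_dim) * 0.01)
    optimizer = torch.optim.Adam([B], lr=1e-3)

    h = torch.randn(seq_len, hidden_dim)          # stand-in for BERT word vectors
    tree_dist = torch.randint(1, 5, (seq_len, seq_len)).float()
    tree_dist = (tree_dist + tree_dist.T) / 2     # stand-in for gold parse-tree distances
    tree_dist.fill_diagonal_(0)

    for step in range(100):
        proj = h @ B.T                            # (seq_len, probe_rank)
        diff = proj.unsqueeze(1) - proj.unsqueeze(0)
        pred_dist = (diff ** 2).sum(-1)           # squared L2 distance for every word pair
        loss = (pred_dist - tree_dist).abs().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()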