TreeMix: Compositional constituency-based data augmentation for natural language understanding

L Zhang, Z Yang, D Yang - arXiv preprint arXiv:2205.06153, 2022 - arxiv.org
Data augmentation is an effective approach to tackle overfitting. Many previous works have
proposed different data augmentation strategies for NLP, such as noise injection, word …

Substructure substitution: Structured data augmentation for NLP

H Shi, K Livescu, K Gimpel - arXiv preprint arXiv:2101.00411, 2021 - arxiv.org
We study a family of data augmentation methods, substructure substitution (SUB2), for
natural language processing (NLP) tasks. SUB2 generates new examples by substituting …
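The core idea of SUB2 is to build new examples by swapping substructures (e.g., constituency subtrees) that share the same label across training sentences. A minimal sketch of that idea, under illustrative assumptions not taken from the paper: trees are nested tuples `(label, children...)`, leaves are token strings, and the toy parses and labels below are invented for demonstration.

```python
import random

def collect(tree, pool):
    """Record every labeled subtree in pool, keyed by its phrase label."""
    label, children = tree[0], tree[1:]
    pool.setdefault(label, []).append(tree)
    for c in children:
        if isinstance(c, tuple):
            collect(c, pool)

def substitute(tree, target, replacement):
    """Return a copy of tree with the subtree `target` replaced."""
    if tree is target:
        return replacement
    return (tree[0],) + tuple(
        substitute(c, target, replacement) if isinstance(c, tuple) else c
        for c in tree[1:]
    )

def leaves(tree):
    """Flatten a tree back into its token sequence."""
    out = []
    for c in tree[1:]:
        out.extend(leaves(c) if isinstance(c, tuple) else [c])
    return out

# Two toy parses; substitute the NP of one with an NP from the other.
t1 = ("S", ("NP", "the", "dog"), ("VP", "ran"))
t2 = ("S", ("NP", "a", "cat"), ("VP", "slept"))

pool = {}
collect(t2, pool)
donor = random.choice(pool["NP"])           # here: ("NP", "a", "cat")
augmented = substitute(t1, t1[1], donor)    # swap in the same-label subtree
print(" ".join(leaves(augmented)))          # → "a cat ran"
```

The same-label constraint is what keeps the output grammatical: an NP slot is refilled only with another NP.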

Audio-visual neural syntax acquisition

CIJ Lai, F Shi, P Peng, Y Kim, K Gimpel… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
We study phrase structure induction from visually-grounded speech. The core idea is to first
segment the speech waveform into sequences of word segments, and subsequently induce …

Unsupervised chunking with hierarchical RNN

Z Wu, AA Deshmukh, Y Wu, J Lin, L Mou - arXiv preprint arXiv:2309.04919, 2023 - arxiv.org
In Natural Language Processing (NLP), predicting linguistic structures, such as parsing and
chunking, has mostly relied on manual annotations of syntactic structures. This paper …

PCFGs can do better: Inducing probabilistic context-free grammars with many symbols

S Yang, Y Zhao, K Tu - arXiv preprint arXiv:2104.13727, 2021 - arxiv.org
Probabilistic context-free grammars (PCFGs) with neural parameterization have been shown
to be effective in unsupervised phrase-structure grammar induction. However, due to the …

Heads-up! Unsupervised constituency parsing via self-attention heads

B Li, T Kim, RK Amplayo, F Keller - arXiv preprint arXiv:2010.09517, 2020 - arxiv.org
Transformer-based pre-trained language models (PLMs) have dramatically improved the
state of the art in NLP across many tasks. This has led to substantial interest in analyzing the …

Data augmentation for machine translation via dependency subtree swapping

A Nagy, DP Lakatos, B Barta, P Nanys, J Ács - arXiv preprint arXiv …, 2023 - arxiv.org
We present a generic framework for data augmentation via dependency subtree swapping
that is applicable to machine translation. We extract corresponding subtrees from the …
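The mechanism here is swapping corresponding dependency subtrees between parallel sentence pairs, on the source and target side simultaneously. A hedged sketch under assumptions not from the paper: projective subtrees are contiguous token spans, and aligned spans for the same dependency relation are already given; the sentence pairs and spans below are toy examples.

```python
def swap_span(tokens, span, replacement):
    """Replace tokens[span[0]:span[1]] with the replacement tokens."""
    i, j = span
    return tokens[:i] + replacement + tokens[j:]

# Two toy English-German pairs; swap the object ("obj") subtrees.
src_a, tgt_a = "she reads the book".split(), "sie liest das Buch".split()
src_b, tgt_b = "he buys a car".split(), "er kauft ein Auto".split()

obj_a_src, obj_a_tgt = (2, 4), (2, 4)   # "the book" / "das Buch"
obj_b_src, obj_b_tgt = (2, 4), (2, 4)   # "a car"    / "ein Auto"

# Swap on both sides so the new pair stays a valid translation.
new_src = swap_span(src_a, obj_a_src, src_b[obj_b_src[0]:obj_b_src[1]])
new_tgt = swap_span(tgt_a, obj_a_tgt, tgt_b[obj_b_tgt[0]:obj_b_tgt[1]])
print(" ".join(new_src), "|", " ".join(new_tgt))
# → "she reads a car | sie liest ein Auto"
```

Swapping the same relation on both sides is the key design point: it yields a new, still-aligned sentence pair rather than two independently perturbed sentences.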

Learning a grammar inducer from massive uncurated instructional videos

S Zhang, L Song, L Jin, H Mi, K Xu, D Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
Video-aided grammar induction aims to leverage video information for finding more accurate
syntactic grammars for accompanying text. While previous work focuses on building systems …

Revisiting the practical effectiveness of constituency parse extraction from pre-trained language models

T Kim - arXiv preprint arXiv:2211.00479, 2022 - arxiv.org
Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent
paradigm that attempts to induce constituency parse trees relying only on the internal …

Unsupervised discontinuous constituency parsing with mildly context-sensitive grammars

S Yang, RP Levy, Y Kim - arXiv preprint arXiv:2212.09140, 2022 - arxiv.org
We study grammar induction with mildly context-sensitive grammars for unsupervised
discontinuous parsing. Using the probabilistic linear context-free rewriting system (LCFRS) …