DisCoDisCo at the DISRPT2021 shared task: A system for discourse segmentation, classification, and connective detection

L Gessler, S Behzad, YJ Liu, S Peng, Y Zhu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper describes our submission to the DISRPT2021 Shared Task on Discourse Unit
Segmentation, Connective Detection, and Relation Classification. Our system, called …

Mention detection in coreference resolution: survey

K Lata, P Singh, K Dutta - Applied Intelligence, 2022 - Springer
Coreference Resolution is an essential task for Natural Language Processing (NLP)
applications, which has a paramount impact on the performance of text summarization …

Why can't discourse parsing generalize? A thorough investigation of the impact of data diversity

YJ Liu, A Zeldes - arXiv preprint arXiv:2302.06488, 2023 - arxiv.org
Recent advances in discourse parsing performance create the impression that, as in other
NLP tasks, performance for high-resource languages such as English is finally becoming …

Quasi: a synthetic question-answering dataset in Swedish using GPT-3 and zero-shot learning

D Kalpakchi, J Boye - Proceedings of the 24th Nordic Conference …, 2023 - aclanthology.org
This paper describes the creation and evaluation of a synthetic dataset of Swedish
multiple-choice questions (MCQs) for reading comprehension using GPT-3. Although GPT-3 is …

What's Hard in English RST Parsing? Predictive Models for Error Analysis

YJ Liu, T Aoyama, A Zeldes - arXiv preprint arXiv:2309.04940, 2023 - arxiv.org
Despite recent advances in Natural Language Processing (NLP), hierarchical discourse
parsing in the framework of Rhetorical Structure Theory remains challenging, and our …

Aggregating crowdsourced and automatic judgments to scale up a corpus of anaphoric reference for fiction and Wikipedia texts

J Yu, S Paun, M Camilleri, PC Garcia… - arXiv preprint arXiv …, 2022 - arxiv.org
Although several datasets annotated for anaphoric reference/coreference exist, even the
largest such datasets have limitations in terms of size, range of domains, coverage of …

Developing a multilayer semantic annotation scheme based on ISO standards for the visualization of a newswire corpus

P Silvano, A Leal, F Silva, I Cantante… - Proceedings of the …, 2021 - aclanthology.org
In this paper, we describe the process of developing a multilayer semantic annotation
scheme designed for extracting information from a European Portuguese corpus of news …

Exploring a Multi-Layered Cross-Genre Corpus of Document-Level Semantic Relations

G Williamson, A Cao, Y Chen, Y Ji, L Xu, JD Choi - Information, 2023 - mdpi.com
This paper introduces a multi-layered cross-genre corpus, annotated for coreference
resolution, causal relations, and temporal relations, comprising a variety of genres, from …

eRST: A Signaled Graph Theory of Discourse Relations and Organization

A Zeldes, T Aoyama, YJ Liu, S Peng, D Das… - Computational …, 2024 - direct.mit.edu
In this article we present Enhanced Rhetorical Structure Theory (eRST), a new theoretical
framework for computational discourse analysis, based on an expansion of Rhetorical …

GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains

YJ Liu, T Aoyama, W Scivetti, Y Zhu, S Behzad… - arXiv preprint arXiv …, 2024 - arxiv.org
Work on shallow discourse parsing in English has focused on the Wall Street Journal
corpus, the only large-scale dataset for the language in the PDTB framework. However, the …