Directions in abusive language training data, a systematic review: Garbage in, garbage out

B Vidgen, L Derczynski - Plos one, 2020 - journals.plos.org
Data-driven and machine learning based approaches for detecting, categorising and
measuring abusive content such as hate speech and harassment have gained traction due …

The ACL anthology network corpus

DR Radev, P Muthukrishnan, V Qazvinian… - Language Resources …, 2013 - Springer
We introduce the ACL Anthology Network (AAN), a comprehensive manually curated
networked database of citations, collaborations, and summaries in the field of Computational …

Modeling concept dependencies in a scientific corpus

J Gordon, L Zhu, A Galstyan, P Natarajan… - Proceedings of the …, 2016 - aclanthology.org
Our goal is to generate reading lists for students that help them optimally learn technical
material. Existing retrieval algorithms return items directly relevant to a query but do not …

Docparser: Hierarchical document structure parsing from renderings

J Rausch, O Martinez, F Bissig, C Zhang… - Proceedings of the …, 2021 - ojs.aaai.org
Translating renderings (e.g. PDFs, scans) into hierarchical document structures is extensively
demanded in the daily routines of many real-world applications. However, a holistic …

The ACL Anthology: Current state and future directions

D Gildea, MY Kan, N Madnani… - … of Workshop for NLP …, 2018 - aclanthology.org
The Association for Computational Linguistics' Anthology is the open source archive,
and the main source for computational linguistics and natural language processing's …

Tokenization: Returning to a long solved problem—a survey, contrastive experiment, recommendations, and toolkit—

R Dridan, S Oepen - Proceedings of the 50th Annual Meeting of …, 2012 - aclanthology.org
We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank
style, and present a new rule-based preprocessing toolkit that not only reproduces the …

DSG: An End-to-End Document Structure Generator

J Rausch, G Rashiti, M Gusev, C Zhang… - … Conference on Data …, 2023 - ieeexplore.ieee.org
Information in industry, research, and the public sector is widely stored as rendered
documents (e.g., PDF files, scans). Hence, to enable downstream tasks, systems are needed …

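Topic discovery and trend analysis in scientific literature based on topic models

He Liang (贺亮), Li Fang (李芳) - Journal of Chinese Information Processing (中文信息学报), 2012 - cs.sjtu.edu.cn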
(Dept. of Computer Science & Engineering, Shanghai Jiao Tong University, Shanghai
200240, China) Abstract: Automatically extracting topics from scientific literature and finding …

Computational linguistics and grammar engineering

EM Bender, G Emerson - Head-Driven Phrase Structure Grammar …, 2021 - library.oapen.org
We discuss the relevance of HPSG for computational linguistics, and the relevance of
computational linguistics for HPSG, including: the theoretical and computational …

NLPExplorer: Exploring the Universe of NLP Papers

M Parmar, N Jain, P Jain, P Jayakrishna Sahit… - Advances in Information …, 2020 - Springer
Understanding the current research trends, problems, and their innovative solutions remains
a bottleneck due to the ever-increasing volume of scientific articles. In this paper, we …