Combating fake news: A survey on identification and mitigation techniques

K Sharma, F Qian, H Jiang, N Ruchansky… - ACM Transactions on …, 2019 - dl.acm.org
The proliferation of fake news on social media has opened up new directions of research for
timely identification and containment of fake news and mitigation of its widespread impact on …

An overview of probabilistic tree transducers for natural language processing

K Knight, J Graehl - … Conference on Intelligent Text Processing and …, 2005 - Springer
Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language
processing, due to powerful generic methods for applying, composing, and learning them …

Same pre-training loss, better downstream: Implicit bias matters for language models

H Liu, SM Xie, Z Li, T Ma - International Conference on …, 2023 - proceedings.mlr.press
Language modeling on large-scale datasets improves performance of various
downstream tasks. The validation pre-training loss is often used as the evaluation metric for …

Graph-based text representation and matching: A review of the state of the art and future challenges

AH Osman, OM Barukub - IEEE Access, 2020 - ieeexplore.ieee.org
Graph-based text representation is one of the important preprocessing steps in data and text
mining, Natural Language Processing (NLP), and information retrieval approaches. The …

Accurate unlexicalized parsing

D Klein, CD Manning - Proceedings of the 41st annual meeting of …, 2003 - aclanthology.org
We demonstrate that an unlexicalized PCFG can parse much more accurately than
previously shown, by making use of simple, linguistically motivated state splits, which break …

Expectation-based syntactic comprehension

R Levy - Cognition, 2008 - Elsevier
This paper investigates the role of resource allocation as a source of processing difficulty in
human sentence comprehension. The paper proposes a simple information-theoretic …

Improving compositional generalization with latent structure and data augmentation

L Qiu, P Shaw, P Pasupat, PK Nowak, T Linzen… - arXiv preprint arXiv …, 2021 - arxiv.org
Generic unstructured neural networks have been shown to struggle on out-of-distribution
compositional generalization. Compositional data augmentation via example recombination …

Large margin methods for structured and interdependent output variables

I Tsochantaridis, T Joachims, T Hofmann… - Journal of machine …, 2005 - jmlr.org
Learning general functional dependencies between arbitrary input and output spaces is one
of the key challenges in computational intelligence. While recent progress in machine …

[Book] Handbook of natural language processing

N Indurkhya, FJ Damerau - 2010 - taylorfrancis.com
The Handbook of Natural Language Processing, Second Edition presents practical tools
and techniques for implementing natural language processing in computer systems. Along …

Compound probabilistic context-free grammars for grammar induction

Y Kim, C Dyer, AM Rush - arXiv preprint arXiv:1906.10225, 2019 - arxiv.org
We study a formalization of the grammar induction problem that models sentences as being
generated by a compound probabilistic context-free grammar. In contrast to traditional …