Overview of the authorship verification task at PAN 2022

E Stamatatos, M Kestemont, K Kredens… - CEUR workshop …, 2022 - research.aston.ac.uk
The authorship verification task at PAN 2022 follows the experimental setup of similar
shared tasks in the recent past. However, it focuses on a different, and very challenging …

Authorship attribution in the era of LLMs: Problems, methodologies, and challenges

B Huang, C Chen, K Shu - arXiv preprint arXiv:2408.08946, 2024 - arxiv.org
Accurate attribution of authorship is crucial for maintaining the integrity of digital content,
improving forensic investigations, and mitigating the risks of misinformation and plagiarism …

Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers

M Siino, I Tinnirello, M La Cascia - Information Systems, 2024 - Elsevier
With the advent of the modern pre-trained Transformers, the text preprocessing has started
to be neglected and not specifically addressed in recent NLP literature. However, both from …

SoK: Memorization in general-purpose large language models

V Hartmann, A Suri, V Bindschaedler, D Evans… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are advancing at a remarkable pace, with myriad
applications under development. Unlike most earlier machine learning models, they are no …

Improving Irony and Stereotype Spreaders Detection using Data Augmentation and Convolutional Neural Network

S Mangione, M Siino, G Garbo - CLEF (Working Notes), 2022 - pan.webis.de
In this paper we describe a deep learning model based on a Data Augmentation (DA) layer
followed by a Convolutional Neural Network (CNN). The proposed model was developed by …

T100: A modern classic ensemble to profile irony and stereotype spreaders

M Siino, I Tinnirello, M La Cascia - CLEF (Working Notes), 2022 - downloads.webis.de
In this work we propose a novel ensemble model based on deep learning and non-deep
learning classifiers. The proposed model was developed by our team for participating at the …

Digital authorship attribution in Russian-language fanfiction and classical literature

A Fedotova, A Romanov, A Kurtukova, A Shelupanov - Algorithms, 2022 - mdpi.com
This article is the third paper in a series aimed at the establishment of the authorship of
Russian-language texts. This paper considers methods for determining the authorship of …

An SVM Ensemble Approach to Detect Irony and Stereotype Spreaders on Twitter

D Croce, D Garlisi, M Siino - CLEF (Working Notes), 2022 - ceur-ws.org
The problem we address in this work is classifying whether a Twitter user has spread Irony
and Stereotype or not. We used a text vectorization layer to generate Bag-Of-Words …

Ensemble Pre-trained Transformer Models for Writing Style Change Detection

TM Lin, CY Chen, YW Tzeng, LH Lee - CLEF (Working Notes), 2022 - ceur-ws.org
This paper describes a proposed system design for Style Change Detection (SCD) tasks for
PAN at CLEF 2022. We propose a unified architecture of ensemble neural networks to solve …

Graph-based siamese network for authorship verification

JA Martinez-Galicia… - CLEF 2022 Labs …, 2022 - downloads.webis.de
Authorship verification is the task of determining whether or not the same author wrote two
texts based on comparing the texts. The PAN@CLEF 2022 [1] Authorship Verification …