Machine-generated text: A comprehensive survey of threat models and detection methods

EN Crothers, N Japkowicz, HL Viktor - IEEE Access, 2023 - ieeexplore.ieee.org
Machine-generated text is increasingly difficult to distinguish from text authored by humans.
Powerful open-source models are freely available, and user-friendly tools that democratize …

Attribution and obfuscation of neural text authorship: A data mining perspective

A Uchendu, T Le, D Lee - ACM SIGKDD Explorations Newsletter, 2023 - dl.acm.org
Two interlocking research questions of growing interest and importance in privacy research
are Authorship Attribution (AA) and Authorship Obfuscation (AO). Given an artifact …

M4: Multi-generator, multi-domain, and multi-lingual black-box machine-generated text detection

Y Wang, J Mansurov, P Ivanov, J Su… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capability to generate fluent
responses to a wide variety of user queries. However, this has also raised concerns about …

Overview of AuTexTification at IberLEF 2023: Detection and attribution of machine-generated text in multiple domains

AM Sarvazyan, JÁ González… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents the overview of the AuTexTification shared task as part of the IberLEF
2023 Workshop in Iberian Languages Evaluation Forum, within the framework of the SEPLN …

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

D Macko, R Moro, A Uchendu, JS Lucas… - arXiv preprint arXiv …, 2023 - arxiv.org
There is a lack of research into capabilities of recent LLMs to generate convincing text in
languages other than English and into performance of detectors of machine-generated text …

Badrock at SemEval-2024 Task 8: DistilBERT to detect multigenerator, multidomain and multilingual black-box machine-generated text

M Siino - Proceedings of the 18th International Workshop on …, 2024 - aclanthology.org
The rise of Large Language Models (LLMs) has brought about a notable shift,
rendering them increasingly ubiquitous and readily accessible. This accessibility has …

Authorship attribution in the era of LLMs: Problems, methodologies, and challenges

B Huang, C Chen, K Shu - arXiv preprint arXiv:2408.08946, 2024 - arxiv.org
Accurate attribution of authorship is crucial for maintaining the integrity of digital content,
improving forensic investigations, and mitigating the risks of misinformation and plagiarism …

Detecting generated scientific papers using an ensemble of transformer models

A Glazkova, M Glazkov - arXiv preprint arXiv:2209.08283, 2022 - arxiv.org
The paper describes neural models developed for the DAGPap22 shared task hosted at the
Third Workshop on Scholarly Document Processing. This shared task targets the automatic …

Stacking the odds: Transformer-based ensemble for AI-generated text detection

D Nguyen, KMN Naing, A Joshi - arXiv preprint arXiv:2310.18906, 2023 - arxiv.org
This paper reports our submission under the team name 'SynthDetectives' to the ALTA 2023
Shared Task. We use a stacking ensemble of Transformers for the task of AI-generated text …

Automatic detection of machine generated texts: Need more tokens

G Gritsay, A Grabovoy… - 2022 Ivannikov Memorial …, 2022 - ieeexplore.ieee.org
Current advances in text generation using neural approaches make it possible to create
texts hardly distinguishable from human texts. A survey to improve the efficiency of automatic …