related:il-hnBPfjQMJ:scholar.google.com/

Whose Text Is It Anyway? Exploring BigCode, Intellectual Property, and Ethics

M Zahrah Choksi, D Goedicke - arXiv e-prints, 2023 - ui.adsabs.harvard.edu

Intelligent or generative writing tools rely on large language models that recognize,
summarize, translate, and predict content. This position paper probes the copyright interests …

[PDF] arxiv.org

Whose text is it anyway? exploring bigcode, intellectual property, and ethics

MZ Choksi, D Goedicke - arXiv preprint arXiv:2304.02839, 2023 - arxiv.org

Intelligent or generative writing tools rely on large language models that recognize,
summarize, translate, and predict content. This position paper probes the copyright interests …

Cited by 2 · Related articles

[PDF] arxiv.org

SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation

X Liu, T Sun, T Xu, F Wu, C Wang, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) have transformed machine learning but raised significant
legal concerns due to their potential to produce text that infringes on copyrights, resulting in …

[PDF] arxiv.org

Do language models plagiarize?

J Lee, T Le, J Chen, D Lee - Proceedings of the ACM Web Conference …, 2023 - dl.acm.org

Past literature has illustrated that language models (LMs) often memorize parts of training
instances and reproduce them in natural language generation (NLG) processes. However, it …

Cited by 58 · Related articles · All 7 versions

[PDF] aclanthology.org

Through the looking glass: Learning to attribute synthetic text generated by language models

S Munir, B Batool, Z Shafiq, P Srinivasan… - Proceedings of the …, 2021 - aclanthology.org

Given the potential misuse of recent advances in synthetic text generation by language
models (LMs), it is important to have the capacity to attribute authorship of synthetic text …

Cited by 19 · Related articles · All 5 versions

[PDF] arxiv.org

The (ab)use of open source code to train large language models

A Al-Kaswan, M Izadi - 2023 IEEE/ACM 2nd International …, 2023 - ieeexplore.ieee.org

In recent years, Large Language Models (LLMs) have gained significant popularity due to
their ability to generate human-like text and their potential applications in various fields, such …

Cited by 13 · Related articles · All 9 versions

[PDF] arxiv.org

Digger: Detecting copyright content mis-usage in large language model training

H Li, G Deng, Y Liu, K Wang, Y Li, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Pre-training, which utilizes extensive and varied datasets, is a critical factor in the success of
Large Language Models (LLMs) across numerous applications. However, the detailed …

Cited by 7 · Related articles · All 2 versions

[PDF] arxiv.org

Ghost Sentence: A Tool for Everyday Users to Copyright Data from Large Language Models

S Zhao, L Zhu, R Quan, Y Yang - arXiv preprint arXiv:2403.15740, 2024 - arxiv.org

Web user data plays a central role in the ecosystem of pre-trained large language models
(LLMs) and their fine-tuned variants. Billions of data are crawled from the web and fed to …

Large language model applications for evaluation: Opportunities and ethical implications

CB Head, P Jasper, M McConnachie… - New directions for …, 2023 - Wiley Online Library

Large language models (LLMs) are a type of generative artificial intelligence (AI) designed
to produce text‐based content. LLMs use deep learning techniques and massively large …

Cited by 31 · Related articles · All 2 versions

[PDF] arxiv.org

Matching pairs: Attributing fine-tuned models to their pre-trained large language models

M Foley, A Rawat, T Lee, Y Hou, G Picco… - arXiv preprint arXiv …, 2023 - arxiv.org

The wide applicability and adaptability of generative large language models (LLMs) has
enabled their rapid adoption. While the pre-trained models can perform many tasks, such …

Cited by 2 · Related articles · All 6 versions
