Detectors for safe and reliable LLMs: Implementations, uses, and limitations

S Achintalwar, AA Garcia, A Anaby-Tavor… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output
to biased and toxic generations. Due to several limiting factors surrounding LLMs (training …

Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges

KR Varshney - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org
Prior work has explicated the coloniality of artificial intelligence (AI) development and
deployment through mechanisms such as extractivism, automation, sociological …

Alignment Studio: Aligning large language models to particular contextual regulations

S Achintalwar, I Baldini, D Bouneffouf… - IEEE Internet …, 2024 - ieeexplore.ieee.org
The alignment of large language models is usually done by model providers to add or
control behaviors that are common or universally understood across use cases and …

DARE to Diversify: DAta Driven and Diverse LLM REd Teaming

M Nagireddy, B Guillén Pegueroles… - Proceedings of the 30th …, 2024 - dl.acm.org
Large language models (LLMs) have been rapidly adopted, as showcased by ChatGPT's
overnight popularity, and are integrated in products used by millions of people every day …

Arabic Dataset for LLM Safeguard Evaluation

Y Ashraf, Y Wang, B Gu, P Nakov, T Baldwin - arXiv preprint arXiv …, 2024 - arxiv.org
The growing use of large language models (LLMs) has raised concerns regarding their
safety. While many studies have focused on English, the safety of LLMs in Arabic, with its …

Word alignment in Discourse Representation Structure parsing

C Obereder, G Recski - Proceedings of the 20th Conference on …, 2024 - aclanthology.org
Discourse Representation Structures (DRS) are formal representations of linguistic
semantics based on Discourse Representation Theory (DRT, Kamp et al., 2011) that …