Detectors for safe and reliable LLMs: Implementations, uses, and limitations

S Achintalwar, AA Garcia, A Anaby-Tavor… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output
to biased and toxic generations. Due to several limiting factors surrounding LLMs (training …

Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges

KR Varshney - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org
Prior work has explicated the coloniality of artificial intelligence (AI) development and
deployment through mechanisms such as extractivism, automation, sociological …

Alignment Studio: Aligning large language models to particular contextual regulations

S Achintalwar, I Baldini, D Bouneffouf… - IEEE Internet …, 2024 - ieeexplore.ieee.org
The alignment of large language models is usually done by model providers to add or
control behaviors that are common or universally understood across use cases and …

DARE to Diversify: DAta Driven and Diverse LLM REd Teaming

M Nagireddy, B Guillén Pegueroles… - Proceedings of the 30th …, 2024 - dl.acm.org
Large language models (LLMs) have been rapidly adopted, as showcased by ChatGPT's
overnight popularity, and are integrated in products used by millions of people every day …

Arabic Dataset for LLM Safeguard Evaluation

Y Ashraf, Y Wang, B Gu, P Nakov, T Baldwin - arXiv preprint arXiv …, 2024 - arxiv.org
The growing use of large language models (LLMs) has raised concerns regarding their
safety. While many studies have focused on English, the safety of LLMs in Arabic, with its …

Word alignment in Discourse Representation Structure parsing

C Obereder, G Recski - Proceedings of the 20th Conference on …, 2024 - aclanthology.org
Discourse Representation Structures (DRS) are formal representations of linguistic
semantics based on Discourse Representation Theory (DRT, Kamp et al., 2011) that …