The robustness of neural networks to input perturbations of bounded magnitude is a central concern for the deployment of deep learning models in safety-critical …
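For reference, the bounded-magnitude threat model mentioned in these snippets is usually formalized (in standard notation, not tied to any one of the cited works) as requiring the prediction to be stable over an epsilon-ball around the input:

\[ \forall\, \delta \ \text{with}\ \|\delta\|_p \le \epsilon : \quad \arg\max_k f_k(x+\delta) = y, \]

where x is an input with true label y, f_k are the classifier's logits, and \epsilon bounds the perturbation size under an \ell_p norm.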
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails. In many domains, adversarial training has proven to be one of the most …
Current research on the adversarial robustness of LLMs focuses on discrete input manipulations in natural language, which can be transferred directly to closed-source models …
In mobile robotics, perception in uncontrolled environments, such as those encountered in autonomous driving, is a central hurdle. Existing active learning frameworks can help enhance perception by …
Over the years, researchers have developed myriad attacks that exploit the ubiquity of adversarial examples, as well as defenses that aim to guard against the security …
The vulnerability of deep learning models to small, imperceptible attacks limits their adoption in real-world systems. Adversarial training has proven to be one of the most promising …
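To make concrete what "adversarial training" refers to in these excerpts, a minimal PGD-based training step in PyTorch is sketched below; the model, loss, and hyperparameters (eps, step size, number of steps) are illustrative assumptions, not the procedure of any particular paper listed here.

    # Minimal sketch of PGD-based adversarial training (illustrative only).
    # Assumes a PyTorch classifier `model`, inputs in [0, 1], and an
    # L-infinity threat model with budget `eps`; values are placeholders.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Inner maximization: find delta with ||delta||_inf <= eps that
        maximizes the classification loss."""
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad = torch.autograd.grad(loss, delta)[0]
            # Gradient ascent step, then project back onto the eps-ball.
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        return delta.detach()

    def adversarial_training_step(model, optimizer, x, y, eps=8/255):
        """Outer minimization: update the model on perturbed inputs."""
        model.eval()  # freeze batch-norm statistics while attacking
        delta = pgd_attack(model, x, y, eps=eps)
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        loss.backward()
        optimizer.step()
        return loss.item()

The inner loop approximates the worst-case perturbation within the epsilon-ball, and the outer step minimizes the loss on those perturbed inputs, i.e., the usual min-max formulation of adversarial training.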
Despite extensive research in the decade since adversarial examples were first described, we still do not know how to train high-accuracy classifiers that are guaranteed to …
The rapid growth of, and widespread reliance on, machine learning (ML) systems across critical applications such as healthcare, autonomous driving, and cybersecurity have underscored …