X Ma, B Kang, Z Xu, M Lin… - Advances in Neural …, 2024 - proceedings.neurips.cc
The major challenge of offline RL is the distribution shift that appears when out-of-distribution actions are queried, which makes the policy improvement direction biased by …
Numerous capability and safety techniques of Large Language Models (LLMs), including RLHF, automated red-teaming, prompt engineering, and infilling, can be cast as sampling …
Denoising diffusion models enable conditional generation and density modeling of complex relationships like images and text. However, the nature of the learned relationships is …
O Chehab, A Hyvarinen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent research has developed several Monte Carlo methods for estimating the normalization constant (partition function) based on the idea of annealing. This means …
Successful applications of InfoNCE (Information Noise-Contrastive Estimation) and its variants have popularized the use of contrastive variational mutual information (MI) …
Markov chain Monte Carlo methods for sampling from complex distributions and estimating normalization constants often simulate samples from a sequence of intermediate …
Y Zhou, Y Han, H Zhuang, H Bao… - Proceedings of The Forty …, 2024 - inria.hal.science
Research on adversarial robustness has predominantly focused on continuous inputs, leaving categorical inputs, especially tabular attributes, less examined. To echo this …
Markov Chain Monte Carlo methods for sampling from complex distributions and estimating normalization constants often simulate samples from a sequence of intermediate …
Self-supervised learning is an increasingly popular approach to unsupervised learning, achieving state-of-the-art results. A prevalent approach consists in contrasting data points …