Versatile dueling bandits: Best-of-both world analyses for learning from relative preferences

A Saha, P Gaillard - International Conference on Machine …, 2022 - proceedings.mlr.press
We study the problem of $K$-armed dueling bandit for both stochastic and adversarial
environments, where the goal of the learner is to aggregate information through relative …

Preference modeling with context-dependent salient features

A Bower, L Balzano - International Conference on Machine …, 2020 - proceedings.mlr.press
We consider the problem of estimating a ranking on a set of items from noisy pairwise
comparisons given item features. We address the fact that pairwise comparison data often …

Versatile dueling bandits: Best-of-both-world analyses for online learning from preferences

A Saha, P Gaillard - arXiv preprint arXiv:2202.06694, 2022 - arxiv.org
We study the problem of $K$-armed dueling bandit for both stochastic and adversarial
environments, where the goal of the learner is to aggregate information through relative …

Fast and accurate ranking regression

I Yildiz, J Dy, D Erdogmus… - International …, 2020 - proceedings.mlr.press
We consider a ranking regression problem in which we use a dataset of ranked choices to
learn Plackett-Luce scores as functions of sample features. We solve the maximum …

Spectral ranking with covariates

SL Chau, M Cucuringu, D Sejdinovic - Joint European Conference on …, 2022 - Springer
We consider spectral approaches to the problem of ranking $n$ players given their incomplete
and noisy pairwise comparisons, but revisit this classical problem in light of player covariate …

Quadratic metric elicitation for fairness and beyond

G Hiranandani, J Mathur… - Uncertainty in …, 2022 - proceedings.mlr.press
Metric elicitation is a recent framework for eliciting classification performance metrics that
best reflect implicit user preferences based on the task and context. However, available …

A graph theoretic approach for preference learning with feature information

A Saha, A Rajkumar - The 40th Conference on Uncertainty in …, 2024 - openreview.net
We consider the problem of ranking a set of $n$ items given a sample of their pairwise
preferences. It is well known from the classical results of sorting literature that without any …

CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models

ST Nguyen, NU Naresh, T Tulabandhula - arXiv preprint arXiv:2403.02745, 2024 - arxiv.org
This paper addresses the challenges of aligning large language models (LLMs) with human
values via preference learning (PL), with a focus on the issues of incomplete and corrupted …

Sample complexity of rank regression using pairwise comparisons

B Kadıoğlu, P Tian, J Dy, D Erdoğmuş, S Ioannidis - Pattern Recognition, 2022 - Elsevier
We consider a rank regression setting, in which a dataset of $N$ samples with features in $\mathbb{R}^d$
is ranked by an oracle via $M$ pairwise comparisons. Specifically, there exists a latent total …

Ranking with features: Algorithm and a graph theoretic analysis

A Saha, A Rajkumar - arXiv preprint arXiv:1808.03857, 2018 - arxiv.org
We consider the problem of ranking a set of items from pairwise comparisons in the
presence of features associated with the items. Recent works have established that $ O …