A survey of reinforcement learning from human feedback

T Kaufmann, P Weng, V Bengs… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …

A review on instance ranking problems in statistical learning

T Werner - Machine Learning, 2022 - Springer
Ranking problems, also known as preference learning problems, define a widely spread
class of statistical learning problems with many applications, including fraud detection …

What are the best systems? New perspectives on NLP benchmarking

P Colombo, N Noiry, E Irurozki… - Advances in Neural …, 2022 - proceedings.neurips.cc
In Machine Learning, a benchmark refers to an ensemble of datasets associated
with one or multiple metrics together with a way to aggregate different systems …

Rank aggregation algorithms for fair consensus

C Kuhlman, E Rundensteiner - Proceedings of the VLDB Endowment, 2020 - par.nsf.gov
Aggregating multiple rankings in a database is an important task well studied by the
database community. High-stakes application domains include hiring, lending, and …

Unsupervised model selection for time-series anomaly detection

M Goswami, C Challu, L Callot, L Minorics… - arXiv preprint arXiv …, 2022 - arxiv.org
Anomaly detection in time-series has a wide range of practical applications. While numerous
anomaly detection methods have been proposed in the literature, a recent survey concluded …

A tale of HodgeRank and spectral method: Target attack against rank aggregation is the fixed point of adversarial game

K Ma, Q Xu, J Zeng, G Li, X Cao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Rank aggregation with pairwise comparisons has shown promising results in elections,
sports competitions, recommendations, and information retrieval. However, little attention …

Measuring and controlling divisiveness in rank aggregation

R Colley, U Grandi, C Hidalgo, M Macedo… - arXiv preprint arXiv …, 2023 - arxiv.org
In rank aggregation, members of a population rank issues to decide which are collectively
preferred. We focus instead on identifying divisive issues that express disagreements …

Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability

F Collas, E Irurozki - International Conference on Machine …, 2021 - proceedings.mlr.press
In this paper, we study mixtures of two Mallows models for top-$k$ rankings with equal
location parameters but with different scale parameters (a mixture of concentric Mallows …

Linear label ranking with bounded noise

D Fotakis, A Kalavasis, V Kontonis… - Advances in Neural …, 2022 - proceedings.neurips.cc
Label Ranking (LR) is the supervised task of learning a sorting function that maps feature
vectors $x \in \mathbb{R}^d$ to rankings $\sigma(x) \in \mathbb{S}_k$ over a finite set of $k$ …

Poisoning attack against estimating from pairwise comparisons

K Ma, Q Xu, J Zeng, X Cao… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
As pairwise ranking becomes broadly employed for elections, sports competitions,
recommendation, information retrieval and so on, attackers have strong motivation and …