view, we adopt a least square surrogate loss approach that solves a supervised learning
problem in two steps: a regression step in a well-chosen feature space and a pre-image (or
decoding) step. We use specific feature maps/embeddings for ranking data, which convert
any ranking/permutation into a vector representation. These embeddings are all well-
tailored for our approach, either by resulting in consistent estimators, or by solving trivially …