Don't throw it away! the utility of unlabeled data in fair decision making

M Rateike, A Majumdar, O Mineeva… - Proceedings of the …, 2022 - dl.acm.org
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022dl.acm.org
Decision making algorithms, in practice, are often trained on data that exhibits a variety of
biases. Decision-makers often aim to take decisions based on some ground-truth target that
is assumed or expected to be unbiased, ie, equally distributed across socially salient
groups. In many practical settings, the ground-truth cannot be directly observed, and instead,
we have to rely on a biased proxy measure of the ground-truth, ie, biased labels, in the data.
In addition, data is often selectively labeled, ie, even the biased labels are only observed for …
Decision making algorithms, in practice, are often trained on data that exhibits a variety of biases. Decision-makers often aim to take decisions based on some ground-truth target that is assumed or expected to be unbiased, i.e., equally distributed across socially salient groups. In many practical settings, the ground-truth cannot be directly observed, and instead, we have to rely on a biased proxy measure of the ground-truth, i.e., biased labels, in the data. In addition, data is often selectively labeled, i.e., even the biased labels are only observed for a small fraction of the data that received a positive decision. To overcome label and selection biases, recent work proposes to learn stochastic, exploring decision policies via i) online training of new policies at each time-step and ii) enforcing fairness as a constraint on performance. However, the existing approach uses only labeled data, disregarding a large amount of unlabeled data, and thereby suffers from high instability and variance in the learned decision policies at different times. In this paper, we propose a novel method based on a variational autoencoder for practical fair decision-making. Our method learns an unbiased data representation leveraging both labeled and unlabeled data and uses the representations to learn a policy in an online process. Using synthetic data, we empirically validate that our method converges to the optimal (fair) policy according to the ground-truth with low variance. In real-world experiments, we further show that our training approach not only offers a more stable learning process but also yields policies with higher fairness as well as utility than previous approaches.
ACM Digital Library
以上显示的是最相近的搜索结果。 查看全部搜索结果

Google学术搜索按钮

example.edu/paper.pdf
搜索
获取 PDF 文件
引用
References