TY - JOUR
T1 - A comparative study of fairness-enhancing interventions in machine learning
AU - Friedler, Sorelle A.
AU - Scheidegger, Carlos
AU - Venkatasubramanian, Suresh
AU - Choudhary, Sonam
AU - Hamilton, Evan P.
AU - Roth, Derek
N1 - Publisher Copyright:
Copyright © 2018, The Authors. All rights reserved.
PY - 2018/2/12
Y1 - 2018/2/12
N2 - Computers are increasingly used to make decisions that have significant impact on people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions. Concretely, we present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and on a large number of existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits), indicating that fairness interventions might be more brittle than previously thought.
AB - Computers are increasingly used to make decisions that have significant impact on people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions. Concretely, we present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and on a large number of existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits), indicating that fairness interventions might be more brittle than previously thought.
UR - http://www.scopus.com/inward/record.url?scp=85095633950&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095633950&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85095633950
JO - arXiv
JF - arXiv
SN - 2331-8422
ER -