Bias Research Bingo
Statistical fairness criteria, explainability, and fighting the (mathematical) law
The U.S. National Science Foundation (NSF) is now scanning research projects for words like biases, inequality, and women — to potentially defund them under recent Presidential orders (1, 2). Having defended my own NSF-supported dissertation research on biases over a decade ago, I’m torn between feeling solidarity — and being afraid to say what I really think (not that it’s ever stopped me before)…
This could be a great opportunity for science to come to terms with its serious methodological and ethical problems. Bias research is a mess. Much of it fails to incorporate insights from the causal revolution.
In addition, statistical fairness criteria are mathematically constrained; certain trade-offs are generally unavoidable. This can create a no-win situation (or all-win, depending on your perspective) in which researchers can probably find some form of bias in any given dataset. This is one manifestation of the larger garden of forking paths problem in multiple comparisons with researcher degrees of freedom.
Yet, the narrative that bias is pervasive has become, well, pervasive — and, inevitably, can become its own bias. At the same time, many relevant findings (e.g., priming, implicit associations, stereotype threat) have failed to replicate or predict anything important. Sometimes, bad behavior underlies replication failures (e.g., in the case of fraud). Sometimes not.
Beyond the inevitable limits of perspective, beyond the perennial persistence of uncertainty, and beyond the universal fact of human stupidity lies a simple mathematical truth: bias can mean a lot of different things. We generally can’t avoid all of them at once. Nor is it clear that we would want to, if we knew what we were compromising in return…
Would you prefer your digitized future to be more accurate, or more equal?
Enter Artificial Intelligence/Machine Learning
Ascriptive algorithmic biases are a common design flaw of increasingly pervasive technology-mediated decisions. From military and criminal justice contexts to medicine, welfare, housing, finance, and education, AI/ML’s promise (improved accuracy) conflicts with its effects (decreased fairness).
Take these recent examples, drawn from top Google News search results for “bias algorithms”: “Bias in Code: Algorithm Discrimination in Financial Systems” (RFK Human Rights, Spencer Wang, Jan. 25, 2025); “AI threatens to cement racial bias in clinical algorithms. Could it also chart a path forward?” (STAT News, Katie Palmer, Sept. 11, 2024); “Algorithmic Bias Continues to Negatively Impact Minoritized Students” (Diverse: Issues In Higher Education, Liann Herder, July 15, 2024). This concern isn’t new: in a particularly well-known 2016 exposé, ProPublica’s analysis of the recidivism risk assessment tool COMPAS (“Correctional Offender Management Profiling for Alternative Sanctions”) found it had a higher false positive rate for blacks and a higher false negative rate for whites.
So biased are we fallible human beings that we can’t help creating biased technologies — or running with their biased outcomes. If only we stopped using these biased technologies, or only used them with a “human in the loop,” or with “bias mitigation strategies” — then outcomes would improve.
At least, that’s one preferred narrative of powerful sociopolitical networks in mainstream and scientific discourse alike. But what if the data demand more exegesis than that? The implicated concepts and outcomes more complex? What if the problem is not a binary one — bias, yes or no? But rather a more nuanced matter of what sorts of outcomes we want to prioritize in an imperfect world of crappy trade-offs? And what we mean by bias given a complex world full of unknowns, inequalities, variations, and other uncontrollable complications that keep us from knowing many a salient ground truth?
That would be a harder story to tell in science and its translation. It might be harder to get funded and published. It might create social and professional tensions with people and networks who believe they already know the question (bias, yes or no?) and its answer (yes).
No matter if that’s the wrong question. So scientists and science communicators might be less likely to tell it, and instead ignore its messenger. (Or maybe they just haven’t gotten the memo yet? I understand there are a lot of memos going around.)
Such is the lot of Brian Hedden’s 2021 “On Statistical Criteria of Algorithmic Fairness” (Philosophy & Public Affairs 49(2): 209-231; winner of the inaugural APA AI2050 Early Career Researcher Prize) and the rest of the recent impossibility literature it critically synthesizes (h/t Will Lowe for this and other related readings and insights). Hedden is a philosophy professor at the Australian National University.
Hedden on Fairness
In the COMPAS case, Angwin et al., writing for ProPublica, “focused on one set of fairness criteria (equal false positive and false negative rates)” to argue that COMPAS is biased against blacks. Meanwhile, Northpointe, the company that created the tool, argued that “the algorithm is not biased, in part because its predictions were equally accurate for the two groups (Dieterich et al. 2016).”
What do we mean by “bias”?
Hedden points out that Northpointe and Flores et al. (who also countered ProPublica in a 2016 publication) focused on different fairness criteria: equal predictive accuracy and calibration. Calibration here refers to whether the percentage of blacks and whites who reoffended, given a particular risk score, was about the same. (Pure Bayesians ignore calibration; in a conceptual universe based on counting within subgroups, it’s redundant.)
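To make the disagreement concrete, here is a minimal sketch with invented counts (not the actual COMPAS figures). With different base rates, a tool can look equally trustworthy by the Northpointe-style criterion (equal positive predictive value, the binary-prediction cousin of calibration) while failing the ProPublica-style criterion (equal false positive rates):

```python
# Toy illustration (made-up counts, NOT the real COMPAS data): two groups
# with different base rates, identical positive predictive value and false
# negative rate, yet different false positive rates.

groups = {
    # counts: true positives, false positives, false negatives, true negatives
    "group_A": dict(tp=400, fp=100, fn=100, tn=400),  # base rate 50%
    "group_B": dict(tp=160, fp=40, fn=40, tn=760),    # base rate 20%
}

for name, c in groups.items():
    positives = c["tp"] + c["fn"]        # people who actually reoffended
    negatives = c["fp"] + c["tn"]        # people who did not
    ppv = c["tp"] / (c["tp"] + c["fp"])  # Northpointe-style criterion
    fpr = c["fp"] / negatives            # ProPublica-style criterion
    fnr = c["fn"] / positives
    base_rate = positives / (positives + negatives)
    print(f"{name}: base rate={base_rate:.2f} PPV={ppv:.2f} FPR={fpr:.2f} FNR={fnr:.2f}")

# group_A: base rate=0.50 PPV=0.80 FPR=0.20 FNR=0.20
# group_B: base rate=0.20 PPV=0.80 FPR=0.05 FNR=0.20
```

Neither side’s arithmetic is wrong; they are measuring different things, and with unequal base rates the two sets of numbers cannot all be equalized at once.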
You can’t have it all
This example illustrates the implications of a series of so-called impossibility theorems. A brief note on terminology: Hedden suggests they should be known as triviality results, since they show that co-satisfaction of the various fairness criteria is not impossible, but rather only possible in marginal or trivial cases. It seems to me that incompatibility would be clearer, since marginal also has normative connotations (e.g., in characterizing marginalized groups) and trivial similarly has other uses (e.g., in characterizing concerns as unimportant). (I’m an incompatibility expert; ask me about my romantic life.)
Anyway, in the impossibility theorem arena, Kleinberg et al. 2016 (“Inherent Trade-Offs in the Fair Determination of Risk Scores”), Chouldechova 2017 (“Fair prediction with disparate impact: A study of bias in recidivism prediction instruments”), and Miconi 2017 (“The impossibility of ‘fairness’: a generalized impossibility result for decisions”) prove that no algorithm can jointly satisfy various combinations of statistical fairness criteria when base rates differ across groups and prediction is imperfect (p. 9-10). Hedden defines 11 different ways of formulating such criteria, including some not yet covered in the impossibility work (p. 6), observes that unequal base rates would generally make the new criteria incompatible too (p. 10), and argues that these proofs show not that fairness dilemmas are inevitable, but rather that not all of these statistical criteria are necessary conditions of fairness.
Hedden then shows that a fair algorithm can violate all the criteria except Calibration Within Groups (p. 11-12). It can even violate the other ten simultaneously, and under some conditions it can do so with equal base rates (p. 12). The other criteria are: for continuous risk scores, balance for the positive class and balance for the negative class across relevant subgroups; and for binary predictions, equal false positive rates, equal false negative rates, equal positive predictive values, equal negative predictive values, equal ratios of false positive to false negative rates, equal overall error rates, statistical parity (aka demographic parity, a commonly rejected notion since it ignores differences in base rates), and equal ratios of predicted to actual positives. (The inclusion of the binary prediction criteria here has implications for my hobbyhorse, mass screenings for low-prevalence problems.)
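For readers who think in code, here is a rough sketch of how a handful of these criteria might be operationalized (the function and variable names are mine, not Hedden’s notation; Calibration Within Groups is omitted because it additionally requires comparing binned scores against observed frequencies inside each group):

```python
import numpy as np

def group_metrics(scores, y_true, threshold=0.5):
    """A few statistical fairness quantities, computed for one subgroup.
    Satisfying a given cross-group fairness criterion then means the
    corresponding entries (approximately) match across subgroups."""
    y_pred = (scores >= threshold).astype(int)
    pos, neg = y_true == 1, y_true == 0
    return {
        # binary-prediction criteria
        "predicted_positive_share": y_pred.mean(),          # statistical parity
        "false_positive_rate": y_pred[neg].mean(),
        "false_negative_rate": 1 - y_pred[pos].mean(),
        "positive_predictive_value": y_true[y_pred == 1].mean(),  # nan if nobody flagged
        "overall_error_rate": (y_pred != y_true).mean(),
        # continuous-score criteria
        "mean_score_actual_positives": scores[pos].mean(),  # balance, positive class
        "mean_score_actual_negatives": scores[neg].mean(),  # balance, negative class
    }

# Usage: compare group_metrics(scores[g], labels[g]) across subgroups g.
# The impossibility results say that, with unequal base rates and imperfect
# prediction, you cannot make all of these entries match at once.
```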
Hedden argues his example algorithm is perfectly fair (footnote 19, p. 13-14), so these other notions of statistical fairness are not necessary conditions of fairness. He recognizes the real world is a lot more complicated than his idealized example (p. 15). Yet, the reason his proposed fair algorithm “violated all of our criteria except Calibration Within Groups… is because the people from [one subgroup] were all relatively ‘clear’ or non-marginal cases, while those from [another] were all relatively unclear or marginal cases,” resulting in more mistaken classifications for the subgroup of relatively unclear cases (p. 16).
Systematic subgroup variation in such ambiguity is a plausible real-world fairness problem, as in the case of possible racial subgroup differences in skin conductance allegedly causing more inconclusive polygraph tests among blacks (more on this in a future post).
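Here is a toy reconstruction of that mechanism (invented numbers, in the spirit of Hedden’s example rather than a reproduction of it): scores equal to the true risks are perfectly calibrated within each group, and base rates are identical, yet the group made up of marginal cases absorbs far more classification errors:

```python
# Toy reconstruction of the "clear vs. marginal cases" mechanism (invented
# numbers): scores equal true risks, so calibration within each group is
# exact and base rates are equal, yet error rates still diverge.

def expected_rates(risk_profile, threshold=0.5):
    """risk_profile: list of (population share, true risk) pairs.
    Returns expected base rate, false positive rate, false negative rate."""
    base = sum(share * risk for share, risk in risk_profile)
    fp = sum(share * (1 - risk) for share, risk in risk_profile if risk >= threshold)
    fn = sum(share * risk for share, risk in risk_profile if risk < threshold)
    return base, fp / (1 - base), fn / base

clear = [(0.5, 0.9), (0.5, 0.1)]     # everyone is a relatively easy call
marginal = [(0.5, 0.6), (0.5, 0.4)]  # everyone is close to a coin flip

for name, profile in [("clear", clear), ("marginal", marginal)]:
    base, fpr, fnr = expected_rates(profile)
    print(f"{name}: base rate={base:.2f} FPR={fpr:.2f} FNR={fnr:.2f}")

# clear: base rate=0.50 FPR=0.10 FNR=0.10
# marginal: base rate=0.50 FPR=0.40 FNR=0.40
```

Equal base rates, exact calibration, and still a fourfold gap in error rates: the difference is driven entirely by how clear-cut each group’s cases are.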
Having disproven the necessity of competing statistical fairness criteria, Hedden also offers a positive argument in favor of the last one standing: Violation of Calibration Within Groups “would mean that the same risk score would have different evidential import for the two groups” so that “A given risk score, intended to be interpreted probabilistically, would in fact correspond to a different probability of being positive, depending on the individual’s group membership” (p. 16-17).
Hedden recognizes that “how we should actually design predictive algorithms depends on more than just the fairness of the algorithm itself” (p. 22). Real-world empirics are more complex than his idealized example, predictive algorithms can be used in different practical ways and aren’t the only point at which we can intervene for fairness (though intervening at many other points may be even less sociopolitically feasible), and structural inequalities may drive genuine subgroup differences that, if taken in isolation, then compound social inequities. Algorithms may also exacerbate stereotypes without being inaccurate or unfair, and this exacerbation may have unintended adverse consequences for affected groups. There is also a whole discourse on disparate impacts and whether they constitute discrimination (many legal systems say yes, but Eidelson 2015 and others see them as redistributive programs; Hedden, p. 20, citing Eidelson, p. 39).
Overall, Hedden recognizes that algorithm design implicates ethical as well as empirical questions well beyond statistical fairness criteria. But his synthesis shows that we cannot answer these questions with the sorts of analyses that researchers and journalists typically undertake for that purpose. Questions with keywords like biases, inequality, and women.
This looks like a different path to the same take-home point of the causal revolution for bias research specifically and most of science in general: its foundation is bad and it must be rebuilt from the ground up. Yet, Hedden doesn’t address causality in that sense here (only in a brief discussion on p. 4-5). Why?
Loftus and Bryson on Explainability
Joshua Loftus might respond that it doesn’t matter why humanity is stuck serving tech (like predictive algorithms) and ideas (like statistical fairness as calibration within groups) instead of the other way around. The point is that we should change that. Loftus is a London School of Economics assistant professor in statistics and affiliate of the LSE Data Science Institute.
Because human beings care about causes (not correlations), the AI/ML community should go ahead and incorporate causal modeling, however imperfectly, to improve quality and advance the scientific evidentiary foundation of increasingly widespread tech. Doing this now rather than later would be better even though, yes, the field is moving fast and imperfect people in an imperfect world will keep doing imperfect work.
This has implications for explainability: causal explanation is the obvious way to go about making “unexplainable” ML models explainable, using the best scientific method at hand. It requires only building different models, using different methods, for different purposes.
Ok, so it doesn’t make unexplainable models explainable, per se. Instead it seeks to make algorithms fairer by looking at causes of unfairness instead of evaluating whether outcomes of black-box processes are fair by some contested criteria (or how someone got into secondary screening, or how they came to have the attributes that got them there, or something else). In other words, we did it wrong and have to do it all over again. (Also bringing the causal revolution to AI/ML: Kilbertus et al.’s 2017 “Avoiding discrimination through causal reasoning” and Drago Plecko and Elias Bareinboim’s 2022 “Causal Fairness Analysis.”)
This is exactly the direction you would expect the literature to take from the contemporary methods corner. As often, ethical concerns pull in the same direction…
Hertie School Professor of Ethics and Technology Joanna Bryson argues that AI can always be built to be explainable, and that only humans can be held accountable anyway. So explainability is a bit of a red herring: the focus should be on getting done, first, the things we care about for first-principles reasons. Not on holding AI/ML to some arbitrarily higher standard of explainability just because tech scares people.
Like Loftus, part of Bryson’s focus is on combatting perfectionism in favor of making the world better now (or trying): we don’t need perfect or complete explanations to get stuff done, she notes. And maybe explainability in AI/ML should mean something less than that, too. Something like sufficient explanation to improve society, including helping people understand what the government is doing and why. And letting people who know about how the system works make informed decisions about whether and how to use it.
Again consistent with Loftus, Bryson goes further, refocusing our attention on humans, not tech, as the agents. If we conceptualize tech as cultural artifact instead of agent (Bryson’s longtime thesis), then whether AI/ML is explainable or not is moot. It’s always people who are designing tech, using it for particular ends in particular contexts, and evaluating those uses. We can’t explain ourselves perfectly or completely, either. We don’t have to.
Fighting the (Mathematical) Law
So what does all this have to do with the accuracy-error trade-off and whether AI/ML can break the mathematical law? Accuracy and error are statistical outcomes, while fairness is a social one. So the former are constrained by definition in a world of probabilistic cues. AI/ML, especially with human expert assistance (complementarity), can be designed to interpret those cues better than people alone in terms of picking out signal from noise — increasing accuracy without incurring an error cost (holding false positives or negatives constant). But, if that’s the result of predictive rather than causal modeling, it might not produce results that are either replicable or explainable…
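One way to picture the complementarity claim, under stylized signal-detection assumptions (Gaussian score distributions and a single flagging threshold; nothing here comes from a real deployed system), is that better separation of signal from noise buys a higher true positive rate at the same false positive rate:

```python
from statistics import NormalDist

# Stylized signal-detection sketch: hold the false positive rate fixed and
# see what better signal/noise separation buys. "Separation" (d-prime) is
# the gap between the noise and signal score means, in standard deviations.

def true_positive_rate(separation, fpr=0.05):
    noise, signal = NormalDist(0, 1), NormalDist(separation, 1)
    threshold = noise.inv_cdf(1 - fpr)  # flag scores above this point
    return 1 - signal.cdf(threshold)    # share of true cases caught

for d_prime in (1.0, 2.0, 3.0):
    print(f"d'={d_prime}: TPR at 5% FPR = {true_positive_rate(d_prime):.2f}")

# d'=1.0: TPR at 5% FPR = 0.26
# d'=2.0: TPR at 5% FPR = 0.64
# d'=3.0: TPR at 5% FPR = 0.91
```

That is the sense in which accuracy can improve while one error type is held constant; it says nothing about whether the gains replicate outside the data the model was built on.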
In which case, some people might argue, it does us no demonstrated net good versus harm. But we might not even know that if we lack a way of verifying ground truth and so cannot tell where results stop generalizing or making causal sense. Therein lies the paramount linkage between methods and ethics here: without a strong evidentiary basis grounded in causal reasoning, we risk deploying tech that reproduces historical inequities without making decisions that are verifiably right or wrong, or otherwise taking actions that neither achieve priority ends nor enact priority principles.
But if all we want to do is achieve agreed ends better and make decisions comparatively explainable, then why isn’t predictive AI/ML à la COMPAS good enough? If it increases accuracy — and makes more explainable decisions than human judges to boot — then why not call that progress? Are allegations of bias in technology sometimes no more than well-intentioned misinterpretations of the implications of poorly designed analyses?
Bayes’ means trades
To answer this question with more than a simple “yes” (which would suffice), we go back to Bayes’ rule to see how recent impossibility theorems and the age-old accuracy-error trade-off relate at their shared conceptual root… Recall that Bayes’ theorem, one of the universal mathematical laws of probability theory, describes how the probability of an event changes once we condition on evidence, including membership in a subgroup with a different base rate. Applied here, it implies that when base rates differ across subgroups, a tool that is equally well calibrated for each cannot also deliver equal false positive and false negative rates.
This means you can’t equalize false positive rates, false negative rates, and calibration across subgroups unless you artificially force base rates to be equal; pushing down one error type pushes up the other. The impossibility theorems emerge from this constraint. It is also why recent AI/ML research holds only one type of error constant while improving accuracy, and why we should stay alert to what happens to the other type.
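Here is one compact way to see the constraint, essentially Chouldechova’s (2017) observation restated in my own notation (so treat the symbols and layout as mine, not hers). Counting up a group’s confusion table links its false positive rate to its base rate p, positive predictive value, and false negative rate:

```latex
% Derivation sketch (my notation), for N people in a group with base rate p:
%   TP  = N p (1 - FNR)          % actual positives who get flagged
%   FP  = TP (1 - PPV) / PPV     % rearranging PPV = TP / (TP + FP)
%   FPR = FP / (N (1 - p))       % false positives over actual negatives
% Substituting the first two lines into the third:
\[
  \mathrm{FPR} \;=\; \frac{p}{1-p} \cdot \frac{1-\mathrm{PPV}}{\mathrm{PPV}} \cdot \bigl(1-\mathrm{FNR}\bigr)
\]
```

Outside Hedden’s trivial edge cases, hold any two of these quantities equal across groups with unequal base rates and the third must differ.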
In other words, satisfying multiple fairness criteria simultaneously is generally impossible because it would require violating Bayes’ rule. And, when you fight the (mathematical) law, the law wins.
Conclusion
Bias research is important. But bias can mean lots of different things. In evaluating algorithmic decision-making, selecting statistical fairness criteria is a complex sociopolitical choice that researchers should make openly and with more awareness of its complexities than is the current norm.
As often, much of the scientific and popular discourse has not yet caught up with its upper-echelon methodological and ethical moorings. That’s not odd or bad. It just means science as a culture has room to grow and sunlight to grow towards, like any other culture.
But one thing will never change: no one is above the (mathematical) law.