Stupid Batman, Soul Worms, the Hand of God, and the Tree of Knowledge
Four frames for a cautionary tale of probability theory and risk management
You put your left foot in, you take your left foot out; you put your left foot in, and you shake it all about. Sorry, wrong song and dance. But lately I’ve been casting about for the right one: a song and dance that people know well enough to recognize, but not so well that they dislike it, with which to retell a favorite cautionary tale of probability theory and risk management. This is a short post about my results so far.
Usual rant: Mass screenings for low-prevalence problems (MaSLoPP) under conditions of persistent inferential uncertainty may be doomed to backfire due to the inescapable accuracy-error trade-off plus secondary screening harms, endangering those they try to protect. The same mathematical structure characterizes MaSLoPP across diverse domains. I’ve written about this in the context of digital communications scanning programs like Chat Control — aka mass surveillance; several types of AI the European Parliament recently proposed banning; the proposed AI “lie detector” iBorderCtrl for foreigners’ non-Schengen border crossings; and Fienberg’s classic polygraph case study. These are all security cases, of which there are many more. There are also many more case studies of programs that are structured exactly like this in mathematical terms across other arenas that I will say more about in future posts (a worked example of the shared arithmetic follows the list below), including:
Medicine:
Mass mammography for early breast cancer detection.
Mass PSA testing for early prostate cancer detection.
Information: Mass screenings that tech companies conduct on their digital platforms for misinformation at the behest of governments.
Education:
Mass student surveillance.
Mass screening of school (including university) papers for evidence of plagiarism and/or AI use.
Mass screening of brain waves to see if students are paying attention, even though this technology can’t do that thing (you could, e.g., be intently focused on something else) and that’s just evil (don’t worry, Chinese parents balked).
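To make the shared arithmetic concrete before the rant resumes, here is a worked example in probability format. The numbers are assumptions chosen for illustration, not figures from any particular program: suppose 1 in 1,000 people screened actually has the problem, the screen catches 99% of true cases, and it falsely flags 1% of everyone else.

```latex
P(\text{problem} \mid \text{flagged})
  = \frac{P(\text{flagged} \mid \text{problem})\, P(\text{problem})}
         {P(\text{flagged} \mid \text{problem})\, P(\text{problem})
          + P(\text{flagged} \mid \text{no problem})\, P(\text{no problem})}
  = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999}
  \approx 0.09
```

Under those assumptions, roughly nine out of ten people flagged by a “99% accurate” screen are false positives. That is the structure shared by every program on the list above.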
MaSLoPP wreaks havoc in society. A sensible society would regulate the class instead of playing Whack-A-Mole with an ever-growing list of new programs with the same old structure. Otherwise, the implications of probability theory may doom costly, systematic efforts that are meant to advance the public interest to undermine it instead.
This is, of course, not *my* rant. It’s Bayes’ theorem: “Our Father, who art in Heaven, hallowed be Thy —” Sorry, wrong cult. The formula is:
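In its textbook form, with H standing for a hypothesis (say, “this person has the condition”) and E for the evidence (say, “the test came back positive”):

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
```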
Breathe. Now close your eyes and think of the ocean. It wouldn’t make sense to describe a wave (much less a set of waves) as a point, as much published science does in misrepresenting interval estimates as point estimates (different topic). Waves are distributions. So are statistical estimates.
You notice the waves flow in sets, and focus on what seem to be sets of three. The first wave (the probability/likelihood) flows by; it’s the way you think something might shake out. Then a second wave, based on experience out there in the ocean (the prior), flows through it; it’s the previous information (as little as one observation) that gets combined with the likelihood. This creates a third wave (the posterior); that’s the updated probability distribution.
It’s hard to intuit what the third wave will look like knowing the first two, which is at least fitting in the metaphor if you know that fluid mechanics is hard. (See, e.g., the convolution chapter of “The Scientist and Engineer’s Guide to Digital Signal Processing.”) But Richard McElreath has, as usual, made a beautiful, clear illustration and explanation of how this works.
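For anyone who would rather see the three waves as numbers than as water, here is a minimal grid-approximation sketch in the spirit of McElreath’s illustration. The data and the prior are invented purely for illustration:

```python
import numpy as np

# Toy grid approximation: estimating an unknown proportion p from a
# handful of observations. All numbers here are made up for illustration.
grid = np.linspace(0, 1, 101)        # candidate values of p

# Wave 1 (likelihood): how plausible the data are at each candidate p,
# for, say, 3 "hits" in 10 trials.
likelihood = grid**3 * (1 - grid)**7

# Wave 2 (prior): previous information -- here, a mild hunch that p is small.
prior = (1 - grid)**2

# Wave 3 (posterior): proportional to likelihood times prior.
posterior = likelihood * prior
posterior /= posterior.sum()         # normalize so the wave sums to 1

print(grid[posterior.argmax()])      # the crest of the third wave
```

The third wave is just the first two multiplied together and rescaled; the hard part, as with real waves, is intuiting its shape before you do the arithmetic.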
Bottom line: Probability theory, like the ocean, follows universal mathematical laws — and yet appears in practice to often have a mind of its own. Also, wouldn’t it be more relaxing to see this when you hear “Bayes’ theorem”?
Credit: Tim Sullivan.
This cult already has some mythologies, many of which I probably don’t know yet. One is Nigel Mathers and Paul Hodgkins’ “The Gatekeeper and the Wizard: a fairy tale,” British Medical Journal 1989; 298 — a plain language Bayes’ theorem fairy tale about the danger of MaSLoPP, how perverse incentives can degrade systems that use them, and magic. I would be happy to find more. Please send them to me if you think of any.
Here are a few possible frames for more such myths.
Stupid Batman: Origins of An Anti-Hero
MaSLoPP is like decommissioning the bat-signal, and sending Batman door to door instead of letting him go where he’s called to help. In all fairness, this doesn’t make Batman stupid, per se. But when you’re stretched thin from knocking on every door in Gotham instead of resting up for when you get a distress signal, see how sharp you are.
In economics, we might hear effects like this referred to in terms of opportunity costs: sending Batman door to door consumes resources, like Batman’s time and skill, that you can’t then allocate to something else. This is why some critics of MaSLoPPs have pointed out that such programs are zero-sum. Society can either invest finite resources in responding to distress calls (e.g., well-resourced investigations when there are specific child abuse concerns, or the best available cancer treatments when there is a malignancy) — or it can invest those same resources in MaSLoPP. At some point, policymakers have to make choices. Why would they choose Stupid Batman?
Maybe because a vote for Stupid Batman is a vote for conspicuous risk mitigation. The math may not shake out. But political leaders may still reap political benefits for “doing something” about major societal problems like threats to public health and safety. And people may never find out if these programs cause net harm instead of benefit. So choosing these programs can be politically rational for policymakers in terms of their own self-interest. It can also be really hard to gut programs once they exist.
Alternately, maybe policymakers just don’t understand Bayes’ theorem. Can you blame them? You have to stare out at the waves and squint. And then the insight comes and goes. At least, that’s my experience of much of statistics.
In which case, there’s hope for the decommissioned bat-signal, poor Batman, and the good people of Gotham. Just because we don’t have a legal regime governing MaSLoPP doesn’t mean we can’t make one. This is a statistics education problem.
Probably this sort of frame is most useful for explaining to a policy audience why using finite resources on MaSLoPP is often a poor choice in a zero-sum world. The risk is that people may misinterpret it as a nickel-and-dime cost-savings argument against MaSLoPP, when really by costs we also mean possible harms to the very people we’re trying to protect with these programs. So caution is needed.
Soul Worms: The Discovery of the Root of All Evil and How to Abolish It
Credit: Sean Mckinnon.
Breaking news: New technology will save us from the rare but serious threat of spiritual larvae that are the root cause of all evil on earth. Technology gets better all the time, but our biggest societal problems stay the same: crime, disease, misinformation, and delinquency threaten the public interest. Security, health, and our very sense of shared reality — the foundations of freedom of thought and association, individual and community well-being — are at stake. Against these threats, cutting-edge tools offer very high accuracy and low false positive rates for diagnosing soul worms. This enables MaSLoPP. It looks like we are on the right track for advancing all kinds of public goods. Go, humanity!
Then I guess I have to rip off “The Lottery” in some kind of steampunk style to show why this doesn’t work out well for society, after all. You know, you say you don’t have soul worms, but the test says you do. Well that is what someone with soul worms would say, isn’t it? We’ll deal with you… [Insert horror scene involving giant trout and a woman splashing around, screaming.]
Probably this sort of frame is most useful for animating a fictional example that people can map their own contexts onto. Which is about trying to better explain foundational concepts, and how to do the calculations, without bringing in any distracting real-world considerations. There’s a place for this in science communication; but I’m not sure how many policymakers or other people can get excited about stopping mass testing for soul worms the way they can get excited about real issues they care about.
The Hand of God
We all have a tendency to forget our limitations, if we ever knew them in the first place; that’s called being human. And that’s why one recurrent structure in classical, Biblical, and much other literature is the hubris-nemesis arc. As I wrote previously (in correction #3 here):
the structure is arete, hubris, ate, nemesis – virtue, arrogance, fatal mistake, divine punishment. It’s also the structure of contemporary efforts to conduct mass screenings using highly accurate technologies with the intention of mitigating rare but serious threats. The virtue takes the form of aiming to fight baddies (spies, terrorists, pedophiles, cancer). The arrogance comes in not seeing that the literal laws of the universe (in the form of probability theory) keep us from being able to do this perfectly, even with shiny new tech. The fatal mistake is applying screening tools to whole populations instead of targeted subgroups. And the divine punishment comes from the properties of the screenings, which then produce excessively large numbers of false positives along with non-negligible numbers of false negatives. There is no escaping math, but we can cause a lot of harm trying.
This could be retold in short form as a modern parable, possibly combining Stupid Batman and the soul worms. Maybe that would mix too many metaphors, or turn (European/academic-ish/other) people off by involving God. But there is no better metaphor for universal mathematical laws limiting what people can do with technology. No better character to say lines like: (Facepalm.) “People. Meet your limits.”
The Tree of Knowledge
For some reason, I am slow to really learn (at the level of internalizing the impulse) to use frequency trees to represent Bayesian analyses. But this is How It’s Done (TM). The canonical reference seems to be Gerd Gigerenzer and Ulrich Hoffrage’s “How to improve Bayesian reasoning without instruction: Frequency formats,” Psychological Review, 1995, 102(4), 684–704. The evidence suggests that the same statistical evidence, when presented in probability format, can really throw people for a loop; but, when presented in frequency format, can be readily understood. This makes sense, since we evolved counting bodies, not working out Bayesian statistical calculations.
Applying this insight suggests that, when you’re talking about Bayesian statistics, starting with a natural frequency tree makes the most sense. This signals that it was bad of me to start with the formula and the waves (a case of an old dog resisting a new trick). It also suggests the obvious Biblical allusion — to the tree of the knowledge of good and evil (sometimes understood simply as the tree of knowledge of everything) in the Garden of Eden in Genesis. So how does this work?
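As a sketch of what such a tree boils down to in code, here is a minimal example. The prevalence, sensitivity, and false positive rate are the same illustrative assumptions as in the worked example earlier in the post, not figures from any real screening program:

```python
# A minimal natural-frequency tree for a hypothetical screening program.
# All parameters are assumptions for illustration.
population = 1_000_000
prevalence = 0.001            # 1 in 1,000 actually has the condition
sensitivity = 0.99            # share of true cases the screen catches
false_positive_rate = 0.01    # share of non-cases it flags anyway

affected = population * prevalence
unaffected = population - affected

true_positives = affected * sensitivity
false_negatives = affected - true_positives
false_positives = unaffected * false_positive_rate
true_negatives = unaffected - false_positives

print(f"{population:,.0f} people screened")
print(f"|- {affected:,.0f} affected: {true_positives:,.0f} flagged, "
      f"{false_negatives:,.0f} missed")
print(f"|- {unaffected:,.0f} unaffected: {false_positives:,.0f} flagged, "
      f"{true_negatives:,.0f} cleared")

ppv = true_positives / (true_positives + false_positives)
print(f"Share of flagged people who are actually affected: {ppv:.1%}")
```

Counting down the branches: about 10,980 people get flagged, of whom only about 990 (roughly 9%) are actually affected. It’s the same evidence as in the probability-format calculation, but far easier to feel in your bones.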
It would be too literal to have a scientist chasing Eve (now an MEP/Senator) with a printed-out frequency tree showing why an imminent proposed policy might be doomed to fail according to the universal laws of mathematics — only to be knocked unconscious by a falling apple from an actual apple tree… But I’m having trouble envisioning anything better. And it bothers me that the tree is upside-down.