This is a holiday card my two-year-old son Alexander made this year, that I mailed to my old piano teacher Edie, from Berlin, Germany to Jacksonville, Florida. Edie and I used to eat chocolate and draw after finishing at the piano; this was, to me, the epitome of a good time. So now I put on music and give A chocolate when we make art. It always meant a lot to me that my mom had my art hung up on the wall at home. When my husband and I separated last year, and I started decorating my own place this summer, it meant even more to stagger A’s art everywhere. And it’s still a different level of joy to see someone else do the same.
Down with a head cold (sinusitis, brain fog) this week, I’m trying to rest and think of making home and science as iterative processes. You build home with your nuclear family, if you’re lucky growing up, and then, like a good baboon, leave to join another tribe. Among other primates, leaving that tribe to try again is generally a sign of social failure. You got a bad spot in the new hierarchy, couldn’t make enough allies to make your life liveable, and decided to roll the dice again because it couldn’t possibly be worse. But amid rising structural instability including heightened inequality and various (other) facets of societal collapse, it’s typical for people of my generation (Millennial) and younger (Gen Z) to migrate again and again, from tribe to tribe, job to job, career to career… Some of us more than others.
I have made myself a tribe
out of my true affections,
and my tribe is scattered!
… says Stanley Kunitz in “The Layers,” a poem about this layering process of creation and destruction, finding and losing. I like how he calls prediction an art at the end. I’m wondering what it means that, in statistics, description and prediction are both art and science. How far can causal inferences and applying the laws of probability get us toward neutrality? Does it matter, as long as we’re getting closer? Or do we just find a truth-hold and cling onto it on the mountainside for dear life?
Revisiting PROBIT, the breastfeeding trial that wasn’t
The promise of the causal inferences revolution, and other recent scientific methodological advances, is that better tools can at least take us farther toward unbiased, if still uncertain, estimates of the truth. In this spirit, maybe I can get a little closer to explaining why, as my first breastfeeding article described in depth, the famous breastfeeding trial, PROBIT, wasn’t. I keep hearing misrepresentations of what it was and what it proved, and those misrepresentations legitimize infant feeding myths.
It’s a myth that PROBIT was a randomized controlled trial of breastfeeding. It’s a myth that it established any causal effects of breastfeeding at all. And it’s important to debunk these myths, because PROBIT may actually have generated evidence of common and preventable harm to children from the very infant feeding norms it’s often used to promote. This is a condensed (re)explanation of why.
One thing at a time (is still more than one thing)
First, the trial implemented an intervention that changed many things instead of just one. This created room for confusion. One particularly serious omission: Researchers didn’t record how long newborns went without milk in the control group. The treatment group got breastfed as soon as possible after birth, in line with exclusive breastfeeding gospel. So it’s possible that the treatment group got starved considerably less on average than the control group, because old Soviet hospital norms may have meant separating newborns from their mothers and feeding them later. This flaw is common and well-recognized in breastfeeding studies, such as Clavano’s well-known research from the Philippines. Not starving babies is probably a Very Good Idea (TM).
So any study that compares breastfeeding immediately after birth with formula-feeding whenever factory-style hospital nursery staff in a prior era got around to it is not only comparing breastfeeding with formula-feeding. It’s also, possibly, comparing shorter with longer neonatal starvation periods (on average, in both groups). And we have good reason to suspect that this biases results in favor of “breastfeeding benefits,” which may instead really be evidence of starvation harms (that the new norms partly avoid).
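Here is a toy simulation of that bundling problem. It is a sketch, not an analysis of PROBIT data: every number in it (the feeding delays, the harm per hour of waiting, the noise) is made up for illustration, and the function names are hypothetical. By construction, breastfeeding itself has zero effect on the outcome; only the wait for first milk matters.

```python
import random
import statistics


def simulate_arm(n, breastfeeding_promoted, delay_range_hours, rng,
                 breastfeeding_effect=0.0, harm_per_hour=0.5):
    """Return simulated outcome scores for one trial arm (toy model)."""
    outcomes = []
    for _ in range(n):
        delay = rng.uniform(*delay_range_hours)  # hours until first milk (assumed)
        score = 100.0
        score -= harm_per_hour * delay           # assumed harm from waiting
        if breastfeeding_promoted:
            score += breastfeeding_effect        # zero by construction
        score += rng.gauss(0, 5)                 # unrelated noise
        outcomes.append(score)
    return outcomes


def arm_difference(treat_delay, control_delay, n=20000, seed=0):
    """Mean outcome in the treatment arm minus mean outcome in the control arm."""
    rng = random.Random(seed)
    treat = simulate_arm(n, True, treat_delay, rng)
    control = simulate_arm(n, False, control_delay, rng)
    return statistics.mean(treat) - statistics.mean(control)


# The package as bundled: early feeding in one arm, later feeding in the other.
bundled = arm_difference(treat_delay=(0, 2), control_delay=(2, 12))
# A counterfactual trial where both arms are fed equally soon.
same_delay = arm_difference(treat_delay=(0, 2), control_delay=(0, 2))

print(f"bundled package vs. old norm:    {bundled:+.2f}")
print(f"same feeding delay in both arms: {same_delay:+.2f}")
```

In this made-up world, the bundled comparison shows a clear "benefit" for the breastfeeding promotion arm, and the benefit disappears once both arms are fed equally soon. If nobody records the delay, the two stories cannot be told apart from the arm comparison alone.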
You may recognize this old friend — this insight from art, cognitive science, and statistics. This is Wittgenstein’s rabbit-duck! The drawing that’s a duck if you look at it one way, and a rabbit if you look at it another.
In the case of PROBIT, two different ways of looking at the same experiment give you different — in fact, diametrically opposed — conclusions. Because if exclusive breastfeeding benefits babies, promoting it as public health policy probably makes sense. (Unless you do something totally crazy like considering mothers first, too, haha!)
But if feeding babies sooner rather than later after birth prevents starvation harm, and that’s the causal driver of observed outcome differences between PROBIT’s treatment and control groups (as well as the primary causal driver in much other breastfeeding research) — then you might want to chop the exclusive breastfeeding paradigm up into tiny pieces and feed it to rabid dogs, instead. Because it starves babies, too. Just usually fewer of them on average than the prior norm. Not starving babies at all, to review, is probably a Very Good Idea (TM).
So how do we know which we’re seeing in PROBIT and other breastfeeding studies? Benefits of breastfeeding, or harms of neonatal starvation? Rabbit or duck?
Distraction
In a social sense, it’s interesting to me that asking this question has repeatedly led to male discussants accusing me of not understanding randomization. That puts me in the position of having to decide whether to defend my expertise on experiments, which is established (e.g., with an ICPSR field experiments certificate). And whether to address the gendered dimension of these claims, which misinterpret my criticisms of PROBIT as misunderstandings of basic methods. This is a distraction.
In my opinion, both defensive responses lose in the sense that they change the focus from science to personhood. This is often what powerful institutions and actors do to subaltern ones in an attempt to delegitimate them. But it’s also taboo to say that, as individual and collective women, and as scientists and activists, we lose by calling out mansplaining (etc.) because it makes things about gender (etc.) when they are not.
These discussions are about truth, and while I cannot erase my perspective in truth-seeking as a woman, I also should not be made to choose between either flagging it or being accused of stupidity in discussions where my gender and competence are both beside the point. You can follow my argument and see if it follows the rules. You don’t have to agree with it, and I don’t have to be smart.
I have no insight to offer on gaming these sorts of social interactions, as nobody who knows me will be surprised to learn. I think my strategy, if I have one, is to talk to the people who hear what I am saying, and ignore the people who don’t.
Maybe this is hypocritical, since I’ve also argued that we need more feminist philosophy of science recognizing the importance of perspective in research methods. But I’m allowed to choose my battles, and this one is dumb. Still, I have to grind the wheels a bit to figure out what I think is going on here and why I’m responding (or not responding) in the way that I am. “That’s my story, and I’m sticking to it.”
Does it throw the game if I then say the quiet part out loud? Is this how other people solve this problem? I have no idea.
“You say you want a revolution…”
Back on the level where scientific discourse belongs, the rabbit-duck problem in PROBIT and other studies like it could be read in historical terms as a ripple of the causal inferences revolution. Before we knew more about how to do visual, structural thinking to apply logical rules like “time is linear” to the thinking underpinning statistical analyses, that thinking was not as logically sound as it can be now. On one hand, this is a revolution. On the other hand, it doesn’t get us out of being human beings with limitations like perspective.
In other words, the problem here goes beyond causal inferences — where rules of logic apply — to pattern recognition, where we’re stuck interpreting evidence with our own two eyes. This is, as usual, something Sander Greenland has written about. Visual plots are great, but “it seems underappreciated that regression curves fit to such plots are closer to subjective impressions than to objective data properties. Even when the pattern of study results is not in dispute, there is always considerable room for dispute about the source of the pattern” (p. 296). Somewhere, somehow, someone is going to have to interpret the evidence.
That’s the problem with claiming PROBIT proves breastfeeding benefits. Its findings show a pattern, but do not establish its source. Exclusive breastfeeding promotion for the treatment group hospitals wasn’t just one intervention, but a set of protocol changes. The binary variable (hospital-level randomization to this exclusive breastfeeding promotion protocol change package) wasn’t. And breastfeeding isn’t a binary variable, either…
Selection effects: Health predicts health, and vice-versa
This is another problem. PROBIT didn’t randomize mothers to breastfeed or not, since this was considered unethical. Today it’s increasingly recognized that breastfeeding problems are common. This means that you couldn’t physically randomize women to breastfeed anyway, at least not in the same way that you can randomize patients to take pills or not (terms and restrictions may apply). There is just too much selection on maternal health, and sometimes infant health (related), for this to work; women with all sorts of health problems tend to have more breastfeeding problems — no surprise there, because breastfeeding is really metabolically intensive. But when PROBIT was conducted, no one asked the women how it was going or why they stopped. Women often stop breastfeeding because it doesn’t work. But you have to think that mothers’ experiences matter to bother getting enough anecdata to think about how this might shake out.
In PROBIT, the researchers used a cluster randomized design slating some hospitals to come into the treatment group before others, since supposedly they were all raring to join the American/Western way in the wake of the collapse of the Soviet Union in 1990s Belarus (where the trial took place), and with it the top-down Soviet medical system. Take it from the Westerners heading up the study who didn’t speak their collaborators’ or subjects’ language, because it seems impossible to get hold of any of the Belarusian collaborators themselves. A top-down medical culture in an authoritarian society transitioning to a global hegemonic paradigm with a language barrier. What could possibly go wrong?
This design likely embedded two main types of selection effects that would tend to bias results in favor of appearing to show benefits from breastfeeding, when one type of them may really show that health predicts health and the other type may really show harms from unsafe breastfeeding practices the trial unwittingly promoted. In the positive universe of the “health predicts health” selection effect, healthier moms have an easier time breastfeeding, may have healthier babies anyway for a range of reasons (genetic, epigenetic, environmental) and then it looks like breastfeeding correlates with better child outcomes — but that’s just because we’ve selected on health. No causal effect of infant feeding here.
It could also be the case, in this positive selection effect universe, that healthier moms tend to be more confident that they’ll be able to breastfeed, and try it at higher rates. After all, a number of physical abnormalities correlate with breastfeeding problems. Women know if they have these abnormalities and might logically worry about them affecting breastfeeding. They’re things like asymmetrical boobs, boobs not changing normally during puberty or pregnancy, and previous breast surgery. They predict breastfeeding problems because they’re caused by things like endocrine disorders interfering with normal breast tissue formation, or tissue injury interfering with normal lactation duct connection.
In the negative universe of the “oops, I did it again (I starved the baby)” selection effect, moms with breastfeeding problems first accidentally starve their babies trying to exclusively breastfeed them because medical professionals tell them it’s important, and then switch to formula because it doesn’t work. This spikes the “formula-fed” group with (a) infants of less healthy moms who had more breastfeeding problems, and (b) infants who got accidentally starved. Again, not starving babies is a Very Good Idea (TM).
These two types of selection effects are not mutually exclusive. This means that we could have at least three things going on that explain PROBIT’s results, that are not breastfeeding benefits. In fact, two out of three of these phenomena fit the story of common and preventable harm from exclusive breastfeeding that I’ve argued other evidence may support. And the other is just health predicting health, which we can’t change.
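The two selection effects can be sketched in another toy simulation. Again, this is an illustration under loudly stated assumptions, not a model fitted to any real data: maternal health is one made-up latent number, healthier moms are assumed more likely to breastfeed successfully, and half of the babies whose moms could not make it work are assumed to have been underfed before the switch to formula. Feeding mode has no causal effect on the outcome anywhere in the code. The names `simulate_selection`, `starvation_harm`, and `p_underfed` are hypothetical.

```python
import math
import random
import statistics


def simulate_selection(n=50000, starvation_harm=0.0, p_underfed=0.5, seed=1):
    """Toy cohort in which feeding mode has no causal effect on the outcome.

    Returns mean outcome among 'breastfed' minus mean among 'formula-fed'.
    """
    rng = random.Random(seed)
    breastfed, formula_fed = [], []
    for _ in range(n):
        health = rng.gauss(0, 1)                  # latent maternal health (assumed)
        p_success = 1 / (1 + math.exp(-health))   # healthier -> easier breastfeeding
        succeeds = rng.random() < p_success
        outcome = 2.0 * health + rng.gauss(0, 1)  # health predicts child outcome
        if succeeds:
            breastfed.append(outcome)
        else:
            # Breastfeeding did not work; some babies were underfed before
            # the switch to formula (assumed fraction: p_underfed).
            if rng.random() < p_underfed:
                outcome -= starvation_harm
            formula_fed.append(outcome)
    return statistics.mean(breastfed) - statistics.mean(formula_fed)


# Universe 1: only "health predicts health" is operating.
health_only = simulate_selection(starvation_harm=0.0)
# Universes 1 and 2 together: selection on health plus starve-then-switch.
both = simulate_selection(starvation_harm=1.0)

print(f"gap from selection on health alone:   {health_only:+.2f}")
print(f"gap with starve-then-switch harm too: {both:+.2f}")
```

Both runs show the "breastfed" group doing better, and the gap widens when starvation harm is added, even though switching any baby from one feeding mode to the other would change nothing in this toy world. The harm lands in the "formula-fed" column simply because that is where the babies end up after breastfeeding fails.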
Holds on the cliff-face of reality
Let’s put some old myths about infant feeding science to bed. There is no proof that exclusive breastfeeding benefits babies or moms. There is a wall-to-wall façade of pseudoscientific bullshit often cited to promote that paradigm, when what it really does is show how faulty a human enterprise science is. How little women’s voices have been heard in recent Western history, which we often think of as uber-progressive. And how careful we have to be in claiming only to have the hold on truth that we have really got, and saying again (and again) how it’s tenuous and we’re still climbing, still wondering where the next hold is going to come…
Sometimes these holds look more sciencey on the surface, because scientists present them in quantitative terms. Sometimes they look more subjective on the surface, because people talk about their experiences. It’s all data. One of the basic tenets of research methods is that we can put these sorts of data together to climb higher.