Science Misreporting on DHEA and Fertility in WaPo
A gentle reminder to call the rogue methodologist ninja help line as needed
Brrring brrrring!
“Hello?”
“Hello. Thank you for calling the rogue methodologist ninja help line. How can I help you?”
“Can you take a look at something?”
“Sure. [Tapping noises… ‘Whack!’ of palm hitting face… Throat clearing.] This is wrong.”
Sexy Intro
This is a post about a wonder drug that may treat infertility — but may also raise women’s and offspring’s cancer risks. It may be the female Viagra — but it may also make you so irritated with your boss that you quit your job. You have to see people as whole human beings to care about all these possible effects, and that’s not how incentives drive researchers to structure research or doctors to treat patients. This is a problem, because people need to see the best available information on all relevant outcomes before they can informed-consent to try the treatment. It’s arguably the problem of science in society today.
Nerdy Intro
This is a post about yet another instance of statistical significance testing misuse as it manifested recently across science journalism, peer-reviewed science, and (maybe) regulatory body action, and the larger universe of psychosocial and political forces to which it relates. The interesting puzzle to me, as a methodologist who can’t seem to read the news or search a database without finding something like this, is why.
One explanation is that universal cognitive biases drive this sort of mistake (Greenland). But if we can do meta-cognition as methodologists, and I’m not the only person in the world with graduate statistics training, then why is this type of mistake still so pervasive? Is change coming? Is this still a knowledge distribution problem? If so, why no rogue methodologist ninja help line? I cannot answer these questions.
But my perspective tends to be dated and can be naive: Help lines are quaint relics of our analog past. Social media really does make it possible for people who don’t know things to ask people who do for help more efficiently than ever before. You still have to know who to trust; there are probably proportionately as many bad statisticians out there as there are bad doctors and lawyers and such. Just add Internet, and you also get instant trolls.
Journalists, for their part, might be rightly afraid of being scooped if they put out copy on a Statisticians for Better Science Communication Slack channel or (worse) under a #StatsTwitter (now #StatsX?) hashtag, or something. Because if the statisticians tell you it’s wrong, and it takes you weeks to fix it, someone else could beat you to it. So maybe perverse incentives drive suboptimal science journalism just like “publish or perish” drives bad medical research and other science. Similar sorts of downward pressures on quality (informal quotas? social expectations?) could drive bad regulatory behavior. These are all cases where people may behave badly to perform well, ironically exhibiting formal incompetence to meet real-world metrics designed to improve quality (a topic relating to Goodhart’s law, and a different post).
This post proceeds as follows: First, it introduces the new case study through a quick look at recent Washington Post reporting on the hormone supplement DHEA and fertility. Then, it looks at the scientific literature the article cites. Next, it looks at the regulatory agency action the article talks about in relation to that literature. Finally, it looks at how psychosocial and political forces have shaped the narrative here. This is about the science that gets done — and the science that doesn’t get done — on women’s sexual and reproductive fulfillment. Ultimately, though, I think a gendered emphasis here would be wrong. Perverse incentives wreak equal-opportunity havoc across society. And so, going forward, this is really a story within a story about missing infrastructure that’s needed to help science, help society, and vice-versa.
Yet Another Case Study in Statistical Significance Testing Misuse
On July 30, 2023 in The Washington Post, Yeganeh Torbati published “How a renowned fertility doctor profits from an unproven supplement: There’s no definitive evidence that DHEA works as a fertility booster, and Norbert Gleicher doesn’t tell all his patients he owns a company that sells it.”
This is all technically correct, but doesn’t tell readers what they need to know: That DHEA may substantially improve live birth rates in at least some subgroups of women struggling with infertility, and you can buy it on the Internet a lot cheaper than Dr. Gleicher sells it.
In the article, Gleicher comes off as a shady character with expensive tastes and undisclosed conflicts of interest who preys on older women desperate to have babies they are overwhelmingly doomed not to have. Again, there’s nothing obviously, factually wrong with this.
But Torbati also suggests Gleicher is selling these women overpriced pseudoscientific crap. The pseudoscience framing is inaccurate.
I am not for DHEA for fertility; I am not against DHEA for fertility. I am not a doctor and don’t give medical advice. I am just a methodologist who thinks people should get the science right, and give other people more meaningful informed consent through better information.
So I say as a methodologist: It is ok to dislike people who do bad things. It is not ok to dismiss their empirical claims as pseudoscience because you don’t like them. It is worse to misrepresent evidence in line with that preferred narrative about how they are bad actors. That is bad if you are a science journalist charged with communicating the truth to the people. And it is worse if you are a regulatory agency with power over the people we would probably all agree have done bad; that is governmental abuse of power. So in my view, the inverted pyramid of people who have done wrong in this story has Gleicher at the bottom, WaPo in the middle, then some scientists who should’ve known better over the media that didn’t, and finally a regulatory agency whose job it was to know better and be fair. Let’s scale this pyramid layer by layer, before dreaming about a better world at the top.
Problem, Layer 1: Reporting
It will come as no surprise to most people (right?) that the reporting here is inaccurate. The article leaps from “no definitive evidence that DHEA works as a fertility booster” in the subtitle, to stronger claims that are unsubstantiated in the body. This does not seem to happen because Torbati asked experts, they told her wrong, and she passed it on (telephone game). The experts she cites, rather, were careful to get their terms right, to talk about uncertainty rather than saying the stuff doesn’t work.
Such talk of uncertainty is not a hedge. It is a key distinction in what we know and what it means. I suspect this distinction got lost somewhere between the expert quotes that observe it, and the research citations (two of four supposedly-but-not-actually null meta-analyses) that miss it. Maybe Torbati misinterpreted the cited experts as hedging. Maybe she misread the research she cites. I don’t know.
Torbati writes:
Other researchers have done randomized, controlled trials on DHEA use in IVF, with mixed results: One found a higher live birthrate among women who took DHEA, and another found higher pregnancy rates. But a third found no difference in response to fertility drugs, or IVF outcomes, between control and treatment groups.
Four reviews and meta-analyses in 2015, 2017, 2021 and 2022 found no benefit to DHEA, especially when only the best-quality evidence was considered, while two others in 2016 and 2018 found that it improved live birth or pregnancy rates.
This summary misrepresents the cited evidence, which shows practically significant possible effects for DHEA on fertility across the board. I already knew these results; the evidence on DHEA is impressively suggestive. It’s one of the most promising current advances in women’s health.
But a subject area-naive reader might have keyed on the language of “no difference” and “no benefit” to generate the suspicion of statistical significance testing misuse. This is typical language of this mistake, in which journalists and scientists often misrepresent possible differences of practical significance as null because they don’t reach “statistical significance.”
So is this Torbati’s misinterpretation, or a telephone game mistake? To answer this question, we have to look at the science…
Problem, Layer 2: Scientific Record, One Randomized Trial and Four Cited Meta-Analyses (Supposedly) Finding No Benefit
Disclaimer: I think blinding in many research contexts reflects a common confusion of regression to the mean with possible expectancy effects, and is a dumb way to design most experiments, reflecting widespread statistical illiteracy and elitism in science (different topic). So lack of blinding is supposedly a problem in this literature. But I don’t think it matters, and ignore it.
Initially, I only bothered looking at the supposedly null meta-analyses. Then, reading this over, I decided to go back and look at the supposedly null randomized trial Torbati also mentioned. Turns out it didn’t use DHEA at all.
In other words, Torbati said randomized, controlled trials on DHEA showed mixed results, and then as proof cited a study that used a different thing (“75 IU of human menopausal gonadotropin (hMG, Menogon; Ferring Pharmaceuticals)”) that didn’t work. That’s not statistical significance testing misuse. That’s just wrong.
Meta-Analysis 1: “Androgens (dehydroepiandrosterone or testosterone) for women undergoing assisted reproduction,” Cochrane Review, Nagels et al, 2015.
This review finds a live birthrate advantage for DHEA, and also notes “pre‐treatment with testosterone was associated with higher live birth rates (OR 2.60, 95% CI 1.30 to 5.20; four RCTs, N = 345, I² statistic = 0%, moderate evidence).” The baselines are different, so as always in meta-analyses, direct comparisons are dicey. But “in women with an 8% chance of live birth with placebo or no treatment, the live birth rate in women using testosterone will be between 10% and 32%” — whereas “in women with a 12% chance of live birth/ongoing pregnancy with placebo or no treatment, the live birth/ongoing pregnancy rate in women using DHEA will be between 15% and 26%.” So DHEA’s live birthrate advantage may be a bit bigger than testosterone’s, but there’s not enough data to tell.
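The arithmetic behind those Cochrane ranges is worth making concrete: an odds ratio multiplies odds, not probabilities, so translating a CI bound into an absolute rate takes a risk-to-odds round trip. A minimal Python sketch (function name mine; baseline rate and CI bounds are the ones quoted above):

```python
def risk_from_or(baseline_risk, odds_ratio):
    """Convert a baseline risk plus an odds ratio into an absolute risk.

    Odds ratios act on odds, so we convert risk -> odds, apply the OR,
    and convert back: odds = p / (1 - p); p = odds / (1 + odds).
    """
    baseline_odds = baseline_risk / (1 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)

# Testosterone: 8% baseline live birth rate; 95% CI for the OR is 1.30 to 5.20
lo = risk_from_or(0.08, 1.30)
hi = risk_from_or(0.08, 5.20)
print(f"testosterone: {lo:.0%} to {hi:.0%}")  # ~10% to ~31% (the review rounds to 32%)
```

Running the same conversion on DHEA’s interval (12% baseline, OR bounds implied by the 15%–26% range) reproduces the review’s numbers, which is a useful sanity check when a summary and its quoted intervals seem to disagree.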
This should ring alarm bells if you have an overview of women’s reproductive health. Polycystic Ovarian Syndrome (PCOS) is the most common reproductive endocrine disorder in women. It causes high testosterone/androgens. It’s one of the most common causes of infertility for women. DHEA is metabolized into estrogen and testosterone. So we can infer from PCOS’s infertility association that raising androgens is probably not going to be good for fertility in all women. Although Gleicher et al suggest a new phenotyping of PCOS such that we should still expect it to help in some of them, that’s a subgroup of a subgroup. Lots of subgroup effects floating around here.
Point being, the subgroup of women who are getting substantial possible fertility benefits from androgen supplementation including DHEA here is probably already an atypical subgroup (women with suspicion of low androgens) of an atypical subgroup (women with fertility problems). It behaves in the exact opposite way from what we think we know about the more common, atypical subgroup (women with PCOS who have hyperandrogenism and resultant anovulation). It’s not clear why we should expect these results to generalize to the whole population of women having fertility problems.
On the subgroup issue, this review recognizes DHEA supplementation for fertility started out as a treatment to raise androgen levels in subgroups where they’re probably low:
[DHEA] was first reported as a treatment in ART in 2000, being used as an adjunct to IVF in women with premature ovarian failure (POF), premature ovarian aging (POA) and diminished ovarian reserve (DOR) (Casson 2000). It has been demonstrated to improve IVF outcomes in women with poor ovarian function and increased follicle stimulating hormone (FSH) levels, when administered prior to and during an IVF cycle (Barad 2006). DHEA, used as an oral preparation in a variety of doses, appears to increase the number of oocytes produced leading to improvements in pregnancy rates, both in intrauterine insemination and IVF cycles (Barad 2007; Gleicher 2011a), while reducing miscarriage rates in women with diminished ovarian reserve undergoing IVF (Gleicher 2009). Despite its widespread use as an adjuvant in ART, there remains uncertainty about the true efficacy of DHEA.
The authors say “In women identified as poor responders undergoing ART, pre‐treatment with DHEA or testosterone may be associated with improved live birth rates.” But they don’t call out the generalizability problem as well as they could. Sometimes Cochrane is bad like that; this goes to problems that have plagued the evidence-based medicine movement according to pioneers like Alvan Feinstein for a long time. You can’t leap directly from subgroup trials to either whole populations or what’s best for particular patients, even though that’s the promise of Cochrane Reviews (just read this, and then you’ll know what to do as a policy-maker or healthcare practitioner).
More to the point, the review authors don’t say there’s no benefit like Torbati claims they do. Rather, the reviewed evidence shows substantial possible benefits of DHEA for fertility.
I’m not exactly sure how Torbati’s misreading occurred. The kindest explanation is that she read the statistical results and misinterpreted them. This would involve typical statistical significance testing misuse: The reported 95% compatibility/confidence intervals cross the null value for an odds ratio (OR = 1) in the case of DHEA: “Multiple pregnancy data were available for five trials, with one multiple pregnancy in the DHEA group of one trial (OR 3.23, 95% CI 0.13 to 81.01; five RCTs, N = 267, very low quality evidence).” And likewise in the case of testosterone: “Multiple pregnancy data were available for three trials, with four events in the testosterone group and one in the placebo/no treatment group (OR 3.09, 95% CI 0.48 to 19.98; three RCTs, N = 292, very low quality evidence).”
But the possible effects described in these intervals are still quite substantial. There is no correct way to read these analyses as showing no benefit, or this meta-analysis as reporting that. No definite benefit, maybe. For testosterone. But substantial possible benefit for DHEA and testosterone, nonetheless. What we care about practically here is the potential; it doesn’t make sense to dismiss it because the reality is uncertain.
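The reading rule at stake can be stated mechanically: an interval that contains the null licenses “inconclusive,” never “no benefit,” because every value inside the interval, including the large ones, stays on the table. A minimal sketch of that rule (function and wording mine; the intervals are the multiple-pregnancy ones quoted above):

```python
def summarize_or_ci(label, lo, hi, null=1.0):
    """State what a 95% CI for an odds ratio does and does not license.

    A CI that contains the null (OR = 1) means the data are compatible
    with no effect -- it does NOT mean the data show no effect: every
    value in the interval remains a live possibility.
    """
    crosses = lo <= null <= hi
    verdict = "inconclusive (null inside CI)" if crosses else "null excluded"
    return f"{label}: OR in [{lo}, {hi}] -> {verdict}; effects up to {hi}x remain compatible"

print(summarize_or_ci("DHEA", 0.13, 81.01))
print(summarize_or_ci("testosterone (live birth, Cochrane)", 1.30, 5.20))
```

The point of writing it out is how asymmetric the two verdicts are: “null excluded” is a positive claim, while “null inside CI” rules nothing in and nothing out.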
In contrast with Torbati’s misrepresentation, the authors conclude:
In women identified as poor responders undergoing ART, pre‐treatment with DHEA or testosterone may be associated with improved live birth rates. The overall quality of the evidence is moderate. There is insufficient evidence to draw any conclusions about the safety of either androgen. Definitive conclusions regarding the clinical role of either androgen awaits evidence from further well‐designed studies.
A maybe is not a no. The practical difference here is between a “we might be able to help you” and an “it’s hopeless.” This misinterpretation hurts women.
Meta-Analysis 2: “The effect of dehydroepiandrosterone (DHEA) supplementation on women with diminished ovarian reserve (DOR) in IVF cycle: Evidence from a meta-analysis,” Qin et al, Journal of Gynecology Obstetrics and Human Reproduction, Volume 46, Issue 1, January 2017, Pages 1-7.
This is basically the same story. This meta-analysis is better than the first about saying its underlying trials deal with a subgroup: Patients with DOR (diminished ovarian reserve). (Cochrane’s trials probably did, too, but they didn’t say.) Reviewing evidence on DHEA’s effects on ovarian response and pregnancy outcome in DOR patients, it finds
clinical pregnancy rates were increased significantly in DOR patients who were pre-treated with DHEA (OR = 1.47, 95% CI: 1.09–1.99), whereas no differences were found in the number of oocytes retrieved, the cancellation rate of IVF cycles and the miscarriage rate between the cases and controls (WMD= −0.69, 95% CI: −2.18–0.81; OR = 0.74, 95% CI: 0.51–1.08; OR = 0.34, 95% CI: 0.10–1.24). However, it is worth noting that when data were restricted to RCTs, there was a non-significant difference in the clinical pregnancy rate (OR = 1.08, 95% CI: 0.67–1.73). We concluded that DHEA supplementation in DOR patients might improve the pregnancy outcomes. To further confirm this effect, more randomized controlled trials with large sample sizes are needed.
Again, the evidence reviewed shows substantial possible benefit of DHEA on fertility in this subgroup, including when the data were restricted to RCTs. Torbati misrepresents what it says. Again, it could be statistical significance testing misuse, because the CI’s lower bound (1.09) sits just above the null odds ratio of 1, and the CI does cross the null in the restricted analysis. Maybe confirmation bias driven by understandable dislike for Gleicher also played a role. In any event, Torbati misrepresented these first two cited meta-analyses as finding no benefit, when they found substantial possible benefits of DHEA on fertility.
Meta-Analysis 3: “The Use of Androgen Priming in Women with Reduced Ovarian Reserve Undergoing Assisted Reproductive Technology,” Richardson and Jayaprakasan, Semin Reprod Med. 2021 Nov;39(5-06):207-219.
A clue that Torbati’s summary of the meta-analytical literature was mistaken: The abstract in this article says “Meta-analyses have consistently reported that DHEA does appear to significantly improve IVF outcome in women with predicted or proven poor ovarian response (POR)…”
But the authors worry those previous analyses might have been biased by including “some normal responders and/or nonrandomized studies.” This is an interesting twist on the generalizability problem: It suggests that the authors entertain the possibility that DHEA helps normal responders — perhaps even more than it helps women with DOR — even though we know that heightened androgens in women also commonly cause infertility. This seems to be a common idea in the reproductive medicine field today. But the fact is that we know it can work both ways, we don’t know how it plays out here, and so we need to account for that in thinking about subpopulations and causes first, and then running analyses.
Anyway, this is the first of the four cited meta-analyses that Torbati says show “no benefit” that actually does say that, at least in the sense that the authors say so (although their evidence doesn’t). In the abstract, for instance, they write: “Our meta-analyses including randomized controlled trials (RCTs) incorporating only women with DOR or POR suggest that DHEA confers no benefit.”
The evidence reported in the article, on the other hand, actually shows something else. For DHEA:
Based on a meta-analysis incorporating eight studies and 878 women, DHEA was associated with improved live birth/ongoing pregnancy rates in women undergoing IVF (odds ratio [OR]: 1.81, 95% confidence interval [CI]: 1.25–2.62). DHEA was also associated with improved clinical pregnancy rates (OR: 1.34, 95% CI: 1.01–1.76) based on a meta-analysis incorporating twelve studies and 1,246 women… In a sensitivity analysis removing the studies at high risk of performance bias, the associations between DHEA and live birth/ongoing pregnancy (OR: 1.50, 95% CI: 0.88–2.5, five RCTs, n = 306, I² statistic = 43%) and DHEA and clinical pregnancy (OR: 1.11, 95% CI: 0.69–1.79, six RCTs, n = 337, I² statistic = 0%) were both lost. However, of the studies included in this subgroup analysis, three incorporated women with normal ovarian reserve (and are therefore less representative of the population for whom DHEA is intended) and one actually had an unclear (not low) risk of bias. In a sensitivity analysis removing the studies which incorporated women with normal ovarian reserve, DHEA was associated with significantly improved clinical pregnancy rates (OR: 1.44, 95% CI: 1.06–1.94; 10 RCTs, n = 1,122, I² = 0%).
…Although only one individual study suggests that DHEA significantly improves clinical pregnancy rates following IVF in women with a predicted or prior POR, all the meta-analyses appear to support this finding (►Table 3). Furthermore, the network meta-analysis published in 2020 by Zhang et al not only suggested that DHEA improves the probability of achieving a pregnancy in women with POR undergoing IVF but also that it is superior to a whole host of other proposed adjuvant treatments including Coenzyme Q10, hCG, testosterone, growth hormone, oestradiol, letrozole, recombinant LH, and clomiphene. We undertook a meta-analysis incorporating the eight published RCTs that compared DHEA with placebo or no treatment in women with predicted or proven POR (►Figs. 2 and 3). It showed similar clinical pregnancy rates (OR: 1.28, 95% CI: 0.90–1.82) and live birth (OR: 1.27, 95% CI: 0.63–2.54).
Statistical significance testing misuse runs throughout: First, where the researchers say the associations were lost, but the CIs say otherwise. Next, when the sensitivity analysis on just DOR women showed “significantly” improved clinical pregnancy rates, the CI’s lower bound (1.06) actually sits quite close to the null odds ratio of 1. Then, the authors represent their meta-analysis results of eight RCTs of DHEA in women with DOR as showing “similar clinical pregnancy rates,” but the CIs show substantial possible benefits — particularly on live births.
Similarly for testosterone:
We undertook a meta-analysis incorporating these six studies that compared testosterone with placebo or no treatment in 443 women with predicted or proven POR (►Figs. 4 and 5). It demonstrates significantly increased clinical pregnancy (OR: 2.40, 95% CI: 1.34–4.31) and live birth (OR: 2.27, 95% CI: 1.17–4.41) rates in women who utilized testosterone. However, after excluding the studies at high risk of bias, no difference in either clinical pregnancy (OR: 1.46, 95% CI: 0.38–5.61) or live birth (OR: 1.26, 95% CI: 0.26–6.07) between the two groups was observed (►Figs. 6 and 7).
Statistical significance testing misuse again: First using “significantly” in a positive way, overstating the implications of positive point estimates instead of interpreting full interval estimates. And second misrepresenting substantial possible difference as “no difference,” making the same mistake in the other direction. In both passages, the thresholding misuse works both ways, as is typical of statistical significance testing misuse.
So this is the third of the four cited meta-analyses supposedly showing no benefit that really shows substantial possible benefit rather than none. But you have to know that the researchers misinterpreted their results to catch this one. The next one falls in this latter category, too. This is not surprising; this is a common methods mistake across science and medicine.
Meta-Analysis 4: “Androgens and diminished ovarian reserve: the long road from basic science to clinical implementation. A comprehensive and systematic review with meta-analysis,” Neves et al, American Journal of Obstetrics and Gynecology, Volume 227, Issue 3, September 2022, Pages 401-413.e18.
The authors report finding “no significant differences” with DHEA priming, but benefits for testosterone pretreatment:
No significant differences were found regarding the number of oocytes retrieved (mean difference, 0.76; 95% confidence interval, −0.35 to 1.88), mature oocytes retrieved (mean difference, 0.25; 95% confidence interval, −0.27 to 0.76), clinical pregnancy rate (risk ratio, 1.17; 95% confidence interval, 0.87–1.57), live-birth rate (risk ratio, 0.97; 95% confidence interval, 0.47–2.01), or miscarriage rate (risk ratio, 0.80; 95% confidence interval, 0.29–2.22) when dehydroepiandrosterone priming was compared with placebo or no treatment. Testosterone pretreatment yielded a higher number of oocytes retrieved (mean difference, 0.94; 95% confidence interval, 0.46–1.42), a higher clinical pregnancy rate (risk ratio, 2.07; 95% confidence interval, 1.33–3.20), and higher live-birth rate (risk ratio, 2.09; 95% confidence interval, 1.11–3.95).
The language “no significant differences” flags possible statistical significance testing misuse, which the results confirm. The 95% CIs for live birth and miscarriage rates in the DHEA group show substantial possible benefits. It’s just that testosterone’s possible benefits appeared higher, and the results of the statistical significance test reflect that. So it would be more accurate to say that both treatments show possible benefit, but testosterone may trump DHEA for this subpopulation (women with DOR).
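One way to see why “testosterone worked, DHEA didn’t” overreads these results: the two clinical pregnancy intervals overlap heavily, and the difference between “significant” and “not significant” is not itself a significant difference. A sketch with the Neves et al risk ratios quoted above (variable and function names mine):

```python
# (point estimate, lower bound, upper bound) for clinical pregnancy risk ratios
dhea = (1.17, 0.87, 1.57)
testosterone = (2.07, 1.33, 3.20)

def overlap(a, b):
    """True if two (point, lo, hi) interval triples share any values."""
    return max(a[1], b[1]) <= min(a[2], b[2])

print(overlap(dhea, testosterone))  # True: the data don't cleanly separate the two treatments
```

Only testosterone’s interval excludes the null, but both intervals include meaningful benefit, and the overlap means the head-to-head comparison is still up for grabs.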
Again, we wouldn’t expect that to necessarily generalize beyond this subgroup. Boosting testosterone in women with normal to high androgens might be expected, instead, to cause infertility.
And remember, the first cited, supposedly null meta-analysis found the same overall thing — that both DHEA and testosterone may help… But with the opposite twist — DHEA looked possibly more helpful than testosterone, more research needed. One of the two having a nose over the other in one meta-analysis or another at this stage of the research doesn’t mean much.
Problem, Layer 3: Regulatory Agency Action, One Wrist-Slapping Letter
Back in WaPo, Torbati wrote:
In 2021, the FDA and FTC sent a joint warning letter to Gleicher. The letter states that Fertility Nutraceuticals’ website claimed its DHEA product “has been show [sic] in multiple studies to … reduce miscarriage risk,” and along with another supplement, can “boost your chance of pregnancy or improve your IVF success rate.”
“The FTC is concerned that one or more of the efficacy claims cited above may not be substantiated by competent and reliable scientific evidence,” the letter said.
Gleicher said the letter was “totally unfair,” because “there was nothing on our website that A, we didn’t believe in, but more importantly, that we did not have … solid data for.”
This all seems right on the surface. Yet, like Torbati’s title, it still misrepresents the big picture of what’s going on here that we care about. The FTC sent Gleicher a four-page letter. The first three pages take issue with the fact that he was promoting supplements to be used as drugs — citing as evidence the fact that he shipped medical literature along with a sale. This is a bizarre misbranding attack.
People buy DHEA — and other substances that are controlled in many other countries — off U.S. Amazon. They do this to use them for medical purposes. It’s business as usual.
This also happens with more mundane substances in countries with better regulatory regimes. For instance, I use an iron supplement I buy off the Internet for medical purposes (anemia prevention). This is part of normal commerce and self-care.
Point being, the lines are blurry. And online sellers aren’t typically shipping people medical literature with their personal care products.
So the FTC’s main problem seems to be that Gleicher’s a doctor citing medical literature on medical uses for something he uses in his medical practice and medical research. Because what you really want as a regulatory agency is instead for companies to sell the same stuff without reference to medical literature on the Internet. Who wins this strange game?
Back on my main meat, the FTC’s science concern gets thrown in via one boilerplate paragraph on page four, almost as an afterthought. They don’t cite any evidence for their concern that “one or more of the efficacy claims cited above may not be substantiated by competent and reliable scientific evidence.” I see four possible explanations:
Possibility 1: FTC Misinterpreted the Evidence (Statistical Significance Testing Misuse). They could be making the same old statistical significance testing misuse mistake that two of the four WaPo-cited “null” meta-analyses (none of which were actually null) made. It happens in scientific discourse all the time. It probably happens in science policy contexts, too, but there is ironically less transparency in this realm to be able to see the details behind this type of letter and thus be able to call out this kind of mistake. Insert rant about democratic institutions’ anti-democratic non-transparency through bureaucracy here.
I think this is the most generous possible explanation, but the first three of four aren’t mutually exclusive…
Possibility 2: FTC Threw in Science Boilerplate (Confirmation Bias). Maybe they had a heated water cooler conversation about Gleicher, and sent him a letter with no devil’s advocate follow-up discussion to check their bias. Maybe they don’t even have an argument. Maybe they don’t have to.
Possibility 3: FTC Had Three Corrections in Mind (But Wasn’t About to Suggest Them). Gleicher should have said “may” instead of “can” when talking about DHEA’s possible benefits. That’s more consistent with the evidence, which is inconclusive, but does reliably appear to show substantial possible fertility benefits from DHEA supplementation.
He also should have been clearer that distinguishing different subgroups of women who respond better to different supplements is something that ongoing research should still be addressing.
And he should have sprinkled boilerplate disclaimers more liberally about how his products “are not intended to diagnose or treat any medical diseases, but only to help patients support well-being through lifestyle modifications. If you think you may have a medical problem or require medical treatment, please consult personally with a physician.” Or something. You know, the same stuff wellness quacks always say to get away with selling snake oil. Maybe Gleicher didn’t know the right wellness quack boilerplate, because he’s actually a doctor. Maybe the wellness quack boilerplate only protects you against the FTC’s science boilerplate if you’re in good graces with them already.
Look, the FTC is not sending letters to everyone who runs a website that says “can” where it should really say “may.” So this is technically an important correction, but also consistent with possibility 2 (confirmation bias/going after this guy).
Possibility 4: FTC Has A Team of Rogue Methodologist Ninjas. They have come up with crack criticisms I haven’t thought of. They expect Gleicher to intuit these criticisms in order to address them, which makes no sense if what you want as a regulatory agency is for people to improve instead of shutting down. I’m curious what these criticisms might be. FTC: Methodologist Doge is standing by for your call.
Problem, Layer 4: Better Research Agendas
I know I said I could not answer the question of why no rogue methodologist ninja help line. I do wonder why states don’t offer such a service to promote quality in science, science communication, and informed consent. Maybe they do, and I missed it? Can you call an operator and be connected to a university data science lab where Methodologist Doge helps check your work? They say everything exists somewhere on the Internet (Rule 34). Maybe there’s a rogue methodologist ninja chatbot who at least gives automated advice on not making the same statistical significance mistakes again and again.
Anyway, I do have some idea what better research agendas might look like, and why we don’t have them already. The general theme is more power to the people and less perverse incentives. The big-picture critique is that power shapes science, and the constructive edge to it is that we don’t have to let it do that quite so much. We just need more critical science reflecting on how power shapes stories, plus better infrastructure for better bringing science to society and societal engagement to science.
Lack of infrastructure isn’t stupid or evil. You just see there’s no road where there should be a road. And eventually somebody builds one.
Point 1: We need better citizen science infrastructure to optimize net results from DHEA use for fertility.
There is a lot of informal clinical and personal use of DHEA (where available) for fertility. But the people trying it are largely not in the subgroups that the best available evidence deals with. They are regular women with fertility problems, not women with diagnosed reduced ovarian reserve (a small subgroup) undergoing IVF (another subgroup in that subgroup).
So there is a spread of possible outcomes here. Best is if DHEA works for regular women with fertility problems anyway. We have a prima facie story about why it might: DHEA declines with age, so raising it may improve egg quality, which may reduce miscarriages, making for healthier pregnancies and more live births. (But we have evidence suggesting that may not actually be the case: at least one subanalysis that excluded regular women, focusing only on androgens in women with DOR, seemed to yield stronger results, as some experts expected.) Worst is if DHEA doesn't work for regular women, and gives them cancer.
Instead of randomized trials to figure this out, we have a lot of desperate women using hormones that may or may not help them have babies and may give them cancer. But we won’t get to see the baby and the cancer outcomes on the same timescale. And it’s not clear that women who take DHEA and go on to develop cancer will go back to their fertility doctors or call the FTC/FDA DHEA safety monitoring line (which doesn’t exist) or whatever to report the possible adverse event. This seems suboptimal.
Why not have this citizen science platform I keep talking about, to help people design and participate in randomized experiments on things like this? If scientific institutions aren't going to do it, and people are going to experiment anyway, then don't we want the data, so future people in the same position may get more effective care, with better informed consent and less preventable harm?
Since there are established, dose-dependent cancer risks from estrogen in other contexts (which I wrote about in far too much detail previously, parts 1 and 2), it might make sense to try to figure out whether a lower dose of DHEA than the usually trialed 75 mg/day is just as effective but safer. Comparing 25 mg/day and 50 mg/day would be a simple two-arm experimental design that people might actually participate in, since everyone gets the treatment, and the lower-dose group can always go up later if the treatment fails and they want to. (At least in the U.S., where, again, this stuff is inexplicably over-the-counter. I love you, America; don't ever change.)
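To give a sense of what such a dose-comparison trial would take, here's a back-of-envelope sample size calculation in Python, using the standard normal-approximation formula for comparing two proportions. The live-birth rates are hypothetical placeholders I made up for illustration, not estimates from any study:

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.8):
    """Sample size per arm for detecting a difference between two
    proportions (two-sided test, normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_b = NormalDist().inv_cdf(power)          # critical value for power
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Hypothetical live-birth rates: 20% on the lower dose vs. 30% on the higher.
print(n_per_arm(0.20, 0.30))  # → 294 women per arm
```

The point of the sketch is that detecting a plausible-sized difference between two active doses takes hundreds of participants per arm, which is exactly the scale at which a citizen science platform could help where individual clinics can't.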
Maybe a registry or a survey is a better idea. I don’t care what form the research takes; I care that something is missing in the knowledge production ecosystem for people who are taking risks in this vein already, to do so in a structured way so we can learn. I seem to be the only one who wonders every day why there is no citizen science platform for experimental and survey research like this. But maybe if I say it enough times, it will bother other people, too.
More generally, there is a lot of risk-taking and not a lot of systematic data-collecting on it, which seems odd in the information age. The "quantified self" movement reflects an atomization that ultimately doesn't serve people's interests; companies' ability to buy massive amounts of personal data on you reflects power asymmetries that don't serve the public interest. If we treat this odd disparity as a flukey problem of missing infrastructure, even though it's power-laden, then maybe we can shift those information asymmetries.
But maybe the reality of research is that it is too complex and requires too much specialist knowledge to do well. So maybe we are stuck with the inefficient institutions we have, with no science democratization effectively possible.
Point 2: If we had better citizen science infrastructure, we could also use it to optimize medical research and practice supporting women’s sexual fulfillment, which is just as terrible a mess as the rest of science and society.
Sorry to bury the lede here, but DHEA could also be the female Viagra. Of all these women taking it for fertility, some of them are presumably having sex. They’re probably having better sex, but we’re not getting data on that. Maybe we should. Hear me out.
A lot of people mistakenly think that testosterone is the ultimate sex hormone. And it can certainly influence libido. But women need estrogen for sexual and reproductive function, too. If you just supplement testosterone in people with vaginas who are low on estrogen or actively blocking it (like in the trans population), they wind up having painful sex, probably from vaginal atrophy and dryness. (This also seems potentially psychologically problematic: If your vagina hurts, it probably doesn’t help with gender dysphoria.)
Sadly, this probably also plays out in postmenopausal women with sexual dysfunction — who are sometimes offered testosterone supplementation for low desire and lack of orgasm. Testosterone fixes the desire problem. But for people with vaginas and low estrogen, it does so at this terrible and not widely understood price. How many postmenopausal women have tried testosterone to improve their sex lives, and predictably developed vaginal atrophy and dyspareunia (painful sex) instead? Medicine is doing a really terrible job at helping women have better sex, and this is part of that story.
By contrast, DHEA converts to both testosterone and estrogen metabolites. This is, hormonally at least, what women need to have both sexual desire and healthy vaginas to do a certain something about it.
But there is a catch: DHEA seems to convert differently transdermally (more testosterone) versus vaginally and orally (more estrogen). Its only FDA-approved use is vaginal (for vaginal atrophy). Its fertility use is mostly oral, but we don't know from randomized trials whether there's a lower effective dose than the standard 75 mg/day for women with DOR, or how route of administration changes efficacy (for reproductive versus sexual functioning purposes) and safety in this and other groups. We just know that too much testosterone seems to be bad for fertility and sex in women without DOR. So this is a kind of pharmakon, where the medicine (DHEA) may be poison for normal women's reproductive/sexual function in one form (transdermal: too much testosterone), off-label medicine in another (oral: possibly risky estrogen exposure), and approved medicine in another (vaginal).
In other words, oral and vaginal DHEA may be the female Viagra. But we don't know how they compare on sexual and reproductive function effects, or how much they increase the risks of hormone-sensitive cancers, particularly breast cancer. And there are apparently no randomized trials underway to figure out how possible benefits and risks net out. (There's plenty of DHEA research; just mostly on other things.)
The irresistible joke: If the genders were flipped, Pfizer, Johnson & Johnson, and NASA would be running Phase 3 trials on patented DHEA offshoots from Australia to the moon. Someone would be studying whether inositol or metformin mitigates the possible breast cancer risk (metformin even though it is horribly tolerated and probably no better, but it's a drug and so yields more profit). Someone would be comparing transvaginal and oral DHEA for reproductive and sexual outcomes as well as cancer risks. Someone would be testing exactly how little DHEA may be effective, because we want to minimize exposure to known carcinogens (hello, estrogen), especially when embryos may also be exposed to them.
To be fair, though, the bigger picture of messy psychosocial and political realities is a lot less sexy and facile than Feminist Warrior Princess vs. Big Pharma Corruption. For instance, boys and young men are at heightened risk of possible cardiovascular complications like myocarditis from Covid vaccination (see example 2). Researching ivermectin and nicotine as possible Covid preventives and treatments wasn't as profitable as researching new pharmaceuticals, and the differential treatment of evidence on them similarly reflects that (see also example 1). Overall, perverse incentives contribute to bad science and science policy, and to misinformation about both. They are equal-opportunity threats to health and informed consent, and we need to recognize that in order to (pretty much literally) fight the power.
Also, DHEA-Viagra is a false equivalence, because Viagra was discovered by accident and the result was easy to see. So let’s not play “divide and conquer.” Let’s just start doing better research on women’s sexual as well as reproductive fulfillment.
All I’m saying is, it’s not a huge leap from “DHEA treats vaginal atrophy/painful sex and increases testosterone, which increases desire” to “DHEA may improve women’s sex lives.” Whereas there’s cause for concern that testosterone (part of current standard of care in some places for low desire/sexual dysfunction in postmenopausal women) may be hurting some women’s sex lives, instead. But there’s not a lot of ongoing research here. Why not isn’t a super interesting question, because the answer is so sadly normal…
Pharmaceutical companies don’t stand to generate huge profits from DHEA, because it’s a hormone and they can’t patent it. So there are no randomized controlled trials on this right now. If DHEA were a drug, there probably already would have been a ton of trials. There would be ads to ask your doctor about it, complete with warnings to seek medical attention if you experience marital harmony lasting longer than four hours. Sometimes perverse incentives mean promising potential treatments go largely ignored, and that may even be a good thing if they turn out to cause net harm anyway. We really need to know more about possible cancer risks here.
Point 3: It would be cool if it were normal for scientists and science journalists to report results in terms that people can understand. Like frequency-format outcome tables counting bodies, instead of aggregated effect size estimates or probability figures that just don't translate well into practical significance for a broad audience. Someone should translate the results from the cited DHEA-fertility meta-analyses into that format, and summarize it as a range of estimated numbers needed to treat to make a baby; but that someone is not me in this post today.
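To show what that translation involves, here's a minimal sketch in Python that turns two event rates into a frequency-format summary and a number needed to treat (NNT). The rates are invented for illustration only, not estimates from the cited meta-analyses:

```python
from math import ceil

def frequency_table(p_control, p_treat, denom=1000):
    """Translate two event rates into frequency format: events per
    `denom` people in each group, plus the number needed to treat
    (NNT) for one additional event."""
    arr = p_treat - p_control  # absolute risk difference
    return {
        "control_events": round(p_control * denom),
        "treated_events": round(p_treat * denom),
        "extra_events": round(arr * denom),
        "nnt": ceil(1 / arr) if arr > 0 else None,
    }

# Made-up rates: 15% live births without the treatment vs. 23% with it.
# Reads as: 150 vs. 230 babies per 1,000 women, 80 extra; treat 13
# women for one additional baby.
print(frequency_table(0.15, 0.23))
```

Counting bodies per 1,000 like this is exactly the frequency format that communicates practical significance where an odds ratio or a p-value doesn't.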
The journals that published those meta-analyses arguably should have required this. It might have prevented the journalistic misrepresentation of their results. The FTC arguably should require this sort of fact box showing estimated possible benefits versus harms be mailed out with every bottle of DHEA and every tube of other hormones you can buy on Amazon in the U.S. But it doesn’t look like they’re into that. Apparently, that would be way worse than selling people these hormones without including potentially relevant medical information.
Point 4: It would be nice to know what people are actually doing and how it is working out for them more broadly.
There is a lot of variation in fertility medicine practices, as in other areas of medicine, science, and life. Supplemental estrogen is still commonly prescribed along with progesterone in the luteal (post-ovulation) phase, even though it’s not clear how its net risks and benefits compare with those of DHEA in different subgroups, and we know it’s a carcinogen. I hope no one is still prescribing synthetic progesterone in fertility treatments, because it doesn’t help and may raise maternal and offspring cancer risks quite substantially; but maybe some people still do that, too. I don’t see a good survey of global practice on this. Probably a lot of people self-experiment, and it would be cool to get more of that data, too.
Overall, the perverse thing about the current lack of information infrastructure here is that some people (especially fertility doctors) profit while women and their possible children take on the risks. So the health costs of hormonal treatments for fertility are privatized on women’s bodies, and future women aren’t getting to benefit from learning from their experiences like they should.
I really wonder what informed consent looks like at fertility clinics about the possible risks of preconception supplemental hormonal exposures for offspring. Anecdotally, I’ve only surprised and horrified people by asking about this; the patients seem pretty uninformed. The problem in informing them is that we don’t know what these risks are. And they may be quite substantial. Do desperate women then get to gamble with their future possible children’s health, to make them exist if they can?
Worse, maybe regulatory agencies aren't working on solving this problem because their clients include citizens like the customers of these clinics: people who want to keep getting these treatments, safety and efficacy aside. The science has gotten so lost in translation that democratic oversight of regulatory agencies may actually pose obstacles to effective regulatory action. No one with power seems interested in defending potential offspring's interest in not having their possible cancer risks increased.
Can We Cut Out the Middlemen?
My larger point is that we may want to consider cutting out the middlemen between people and science where we can. This sort of constant repetition of basic methodological mistakes makes me want to stop staffing my own humble Substack version of the rogue methodologist help line, because I have more interesting problems to study. This is a distraction, and I struggle with those enough. If reporters, scientists, and regulatory agencies can't seem to get this one thing right (interpreting evidence without misusing statistical significance test results) then can't we have infrastructure that does it for them? If people need better information on which to base better decisions, then shouldn't we scientists work on giving it to them in ways that keep this common misinterpretation from happening? Otherwise there are just more steps, more places in the chain for the telephone game to go awry.
It’s at least heartening to see that the problem here wasn’t scientists misleading a journalist. The experts in Torbati’s piece spoke carefully and, as far as I saw, correctly about the uncertainty at issue when it comes to DHEA and fertility. But their message got distorted in service of an incorrect preferred narrative about what the science said in the end.
That spin doesn’t serve women. It doesn’t even make for a sexier article than “wonder drug may treat infertility, tighten vaginas, increase sexual desire, and help you quit your job.” It was just an easier story to write. The narrative of a villain and his snake oil was there for the taking. Torbati took it, probably with the good intentions of protecting women from a corrupt quack.
So the reporter and her fact-checker made mistakes and misrepresented the science, including on the most important takeaway point: DHEA probably solves some women’s fertility problems; we’re just not sure how many or which ones. But the blame doesn’t rest entirely on them. At least some of the scientists whose work they dealt with (like the authors of the last two of four cited null-but-not-meta-analyses) could have communicated their results’ real-world significance better. Institutional science doesn’t put a premium on science communication. Scientific publishing is a morass of corruption and ineptitude (burn it down). Too much administration and not enough time for quality degrades both. So some error and cloudiness filters down from science to science journalism. That’s not Torbati’s fault.
Maybe there’s no solvable problem here. Maybe miscommunication is just what people do with words. Maybe the problem of science in society is one of inescapable middlemen (telephone games). Or maybe part of the problem is that people don’t have an easy way to participate more directly in making the science they need. Could we not have infrastructure that gives it to them? Instead of wishing that people who don’t seem to understand stats would just call a statistician, could we not make some platforms that help people structure their observations, analyses, and interpretations more accurately and transparently, so that accidental but widespread mistakes are less likely?