Declining childhood vaccination rates risk a return of endemic measles and other formerly banished childhood diseases, along with large numbers of fully preventable hospitalizations and deaths, as Kiang et al.’s recent JAMA model suggests. Measles vaccination looks like a textbook case of equilibrium effects, but what are those, and how do we think about modeling them? Is this going to mean vaccine mandates? Or is it a lost cause in a “post-herd immunity world”? Is this going to hurt? (My kids??) Notes on a few recent readings…
tl;dr —
Besserve & Schölkopf = modeling interventions in equilibrium systems
Massidda et al. = abstraction theory under soft interventions
Eberhardt & Scheines = hard vs. soft interventions in causal inference
Dash = foundational critique of DAGs in equilibrium settings
Weinberger = philosophical engagement with when not to intervene
“Learning soft interventions in complex equilibrium systems,” Michel Besserve and Bernhard Schölkopf, Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), PMLR 180:170-180.
The problem: “Complex systems often contain feedback loops that can be described as cyclic causal models. Intervening in such systems may lead to counterintuitive effects, which cannot be inferred directly from the graph structure.” Meanwhile, “a priori intuitive interventions in [complex systems with feedbacks] may lead to paradoxical outcomes” (p. 170). So methodologists have struggled to apply insights from the causal revolution to complex equilibrium systems. Which include basically all human relations.
The solution: “a differentiable soft intervention design framework for general equilibrium systems” (p. 178) building on recent deep neural network research that produced gradient-descent-based optimization of equilibrium models (p. 172; “Deep Equilibrium Models,” Bai et al., NeurIPS Proceedings, Advances in Neural Information Processing Systems 32, 2019; see also video and other video). Soft interventions are “interventions that do not change the causal structure,” and proponents argue they “provide a more realistic account of changes that can be performed in real life systems” (p. 170). (The contrast point is hard interventions that change the structure of the graph, citing Eberhardt and Scheines, 2007.)
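To make the recipe concrete, here’s a minimal sketch in Python of the general idea: a cyclic system solved by fixed-point iteration, a soft-intervention parameter θ that shifts one mechanism without cutting any edges, and gradient descent on θ toward a target equilibrium. Everything here (the toy system, the loss, the finite-difference gradient standing in for the paper’s implicit differentiation) is my illustrative assumption, not Besserve and Schölkopf’s code.

```python
import numpy as np

# Toy cyclic system: x = tanh(W @ x + b + theta * m).
# W encodes feedback loops (a cyclic causal structure); the soft
# intervention shifts x[0]'s mechanism by theta without removing edges.
W = np.array([[0.0, 0.5, -0.3],
              [0.4, 0.0, 0.2],
              [-0.2, 0.6, 0.0]])
b = np.array([0.1, -0.2, 0.3])
m = np.array([1.0, 0.0, 0.0])   # intervention targets x[0]'s mechanism

def equilibrium(theta, iters=200):
    """Solve for the fixed point by iteration (the map is a contraction)."""
    x = np.zeros(3)
    for _ in range(iters):
        x = np.tanh(W @ x + b + theta * m)
    return x

target = np.array([0.5, 0.0, 0.2])

def loss(theta):
    return np.sum((equilibrium(theta) - target) ** 2)

# Gradient descent on the intervention parameter; a central finite
# difference stands in for differentiating through the solver.
theta, lr, eps = 0.0, 0.3, 1e-5
for step in range(100):
    grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    theta -= lr * grad

print(f"learned theta = {theta:.3f}, equilibrium = {equilibrium(theta)}")
```

The design is what the intervention preserves: we never rewire W, we only learn how hard to push on one mechanism and let the feedbacks settle where they will.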
The authors take the example of transitioning to more sustainable economies, where energy efficiency gains may net increase energy consumption due to increased demand as technology improves (citing Jevons 1866 and Brockway et al 2021). They cite previous researchers’ theories that unanticipated behaviors like rebound effects (Wallenborn 2018) “may reflect balanced causal relationship designed by evolution [Andersen, 2013] and feedback loops [Blom and Mooij, 2021] that maintain a system at an optimal ‘equilibrium’ operating point independent from external perturbations, challenging classical causal inference assumptions of faithfulness and acyclicity” (p. 170). In other words, some complex systems might resist structural change due to embedded feedback loops or evolved equilibria. So working within the underlying causal structure of the world using soft interventions, instead of trying to change the existing causal dynamics through hard interventions, may work better.
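The rebound logic is easy to see in a stylized constant-elasticity toy (my numbers, not the paper’s): energy use is demand for the energy service divided by efficiency, so whether an efficiency gain cuts or raises total use depends on how elastic demand is.

```python
# Stylized rebound/backfire toy: service demand S = k * cost**(-elasticity),
# where cost per unit of service = price / efficiency.
# Energy used E = S / efficiency = k * price**(-eps) * efficiency**(eps - 1).
def energy_use(efficiency, elasticity, price=1.0, k=1.0):
    cost = price / efficiency
    service_demand = k * cost ** (-elasticity)
    return service_demand / efficiency

for eps in (0.5, 1.5):  # inelastic vs. highly elastic demand
    before = energy_use(1.0, eps)
    after = energy_use(1.5, eps)   # 50% efficiency gain
    change = 100 * (after - before) / before
    print(f"elasticity {eps}: energy use changes by {change:+.0f}%")
# elasticity 0.5: energy falls (~-18%); elasticity 1.5: energy RISES (~+22%),
# the Jevons-style backfire the authors describe.
```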
It’s worth noting that this is coming from the bleeding edge of causality and machine learning. Schölkopf, who heads the Department of Empirical Inference at the Max Planck Institute for Intelligent Systems in Tübingen, Germany, has been introducing machine learning labs to the causal revolution since 2017 (see, e.g., paper; video; slides), and the world has seen the results. For instance, following Schölkopf’s talk on causality, the machine learning lab that became Google DeepMind went on to greatness: The 2024 Nobel Prize in Chemistry went to Google DeepMind co-founder/CEO Sir Demis Hassabis and Director Dr. John Jumper for AlphaFold, the 3D protein folding AI powerhouse that just put a generation of biochem graduate students out of dissertations. (David Baker was also co-awarded the prize for the computational design feat of building new kinds of proteins.)
So maybe learning soft interventions is the right tool for the cyclic causal model job. But the problem with intervening in cyclic causal systems in equilibrium seems to be that we can’t predict net effects, and this application doesn’t seem to solve that problem. Rather, “Further work in this direction will need to address identifiability of the considered models from observational or experimental data.”
In other words, even if we can optimize soft interventions in equilibrium systems, we still don’t know how to reliably tell, from real-world data, what causal model we’re operating in. Without identifiability, we don’t know what system we’re intervening in, hindering prediction.
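Here’s a small demonstration of that identifiability problem (my construction, not from the paper): two linear Gaussian models, one cyclic and one acyclic, that imply exactly the same observational covariance yet make different predictions under intervention.

```python
import numpy as np

def implied_cov(a, b, s1, s2):
    """Observational covariance of X = a*Y + e1, Y = b*X + e2,
    with independent noise variances s1, s2 (linear Gaussian)."""
    B = np.array([[0, a], [b, 0]])
    A = np.linalg.inv(np.eye(2) - B)
    return A @ np.diag([s1, s2]) @ A.T

# Model 1: cyclic (X and Y cause each other).
# Model 2: acyclic (Y does not cause X at all).
cov1 = implied_cov(a=0.5, b=0.5, s1=1.0, s2=1.0)
cov2 = implied_cov(a=0.8, b=0.0, s1=0.8, s2=20/9)

print(np.allclose(cov1, cov2))  # True: observationally indistinguishable

# But under do(X = 1), E[Y] = b: 0.5 in model 1, 0.0 in model 2.
# No amount of observational data tells you which system you're in.
```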
This is the same problem I’ve been running into drawing causal diagrams for security programs, including the proposed EU digital communications surveillance program “Chat Control.” Security affects liberty and liberty affects security in a complex, cyclic causal structure. So the ways we currently know to do cost-benefit analyses of such programs don’t work, because they can’t account for all causal mechanisms, including feedbacks.
This seems to suggest that the current state of science is: “An entire class of efficacy analyses is wrong.” But that we have maybe not yet arrived at: “And here’s the corrected version.”
“Causal Abstraction with Soft Interventions,” Riccardo Massidda, Atticus Geiger, Thomas Icard, and Davide Bacciu, Proceedings of Machine Learning Research, Vol. 213, p. 1–20, 2023.
Additional specifications of soft and hard interventions here: hard denotes fixing a variable to a single value (citing Pearl, 2009); soft formalizes “the replacement of causal mechanisms with different and possibly non-constant mechanisms” (Eberhardt and Scheines, 2007) (p. 1). This is more realistic. Massidda et al. give the example of increasing network weights in a deep learning model, but most people will be more familiar with policy-making or biological systems contexts in which reality is messy.
The mechanics of the contribution: Massidda et al. extend existing theories of abstraction from hard to soft interventions. They enforce causal consistency for all exogenous and endogenous settings, a requirement stronger than a strict generalization of τ-abstraction, and that strengthening yields their new definition of soft abstraction. Soft abstraction offers a more flexible, approximate form of causal abstraction that’s more realistic in the context of complex systems like neural networks and societies, where an intervention at one level may neither stay at that abstraction level nor have predictable fixed effects at another level.
It would be easy to confuse this with deliberately lossy transformations in related areas from neural networks to fuzzy logic, where the correspondence between model and reality may be partial, graded, or context-sensitive. (“Lossy” is a common term in information theory and signal processing, best known from compression: everybody knows that “lossy” formats like JPEG or MP3 trade fidelity for efficiency.)
But the opposite is actually going on here: Massidda et al tightened the conditions under which abstraction is permitted. However, it remains to be seen whether or when these stricter conditions can be satisfied in practice to produce better causal models or more useful predictions.
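For a feel of what that consistency condition demands, here’s a minimal sketch (my toy, far simpler than Massidda et al.’s formalism): a low-level model with two dose variables abstracts to a high-level model with their total, and we check that applying a soft intervention at the low level and then abstracting agrees with abstracting first and intervening at the high level, for every exogenous setting.

```python
import itertools

# Low-level model: doses D1, D2 from exogenous U1, U2; outcome O = D1 + D2.
# High-level model: total dose T; outcome O = T.
# Abstraction tau maps (D1, D2, O) -> (T, O) with T = D1 + D2.

def solve_low(u1, u2, d1_mech=lambda u: u):
    d1 = d1_mech(u1)        # a soft intervention may replace this mechanism
    d2 = u2
    return d1, d2, d1 + d2

def solve_high(u1, u2, t_mech=lambda u1, u2: u1 + u2):
    t = t_mech(u1, u2)      # the corresponding high-level soft intervention
    return t, t

def tau(d1, d2, o):
    return d1 + d2, o

# Soft intervention: double D1's mechanism (structure intact, mechanism swapped).
low_soft = lambda u: 2 * u
high_soft = lambda u1, u2: 2 * u1 + u2   # its high-level counterpart

# Consistency must hold for ALL exogenous settings, not just on average;
# here we verify it exhaustively on a small grid.
for u1, u2 in itertools.product(range(4), range(4)):
    assert tau(*solve_low(u1, u2, low_soft)) == solve_high(u1, u2, high_soft)
print("abstraction is consistent under the soft intervention")
```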
“Interventions and Causal Inference,” Frederick Eberhardt and Richard Scheines, Proceedings of the 2006 Biennial Meeting of the Philosophy of Science Association, Vol. 74, No. 5, p. 981-995, 2007.
The source of the hard (structural) versus soft (parametric) interventions distinction:
The work we present here describes two sorts of interventions (“structural” and “parametric”) that seem crucial to causal discovery. These two types of interventions form opposite ends of a whole continuum of ‘harder’ to ‘softer’ interventions (p. 2).
Hard interventions set a variable to a value, as in randomized controlled trials. Eberhardt and Scheines argue that, while hard interventions are powerful for identifying causal structures, they may not always be ethical or feasible. (A generally accepted take on the power and the limits of RCTs.)
By contrast, soft interventions modify the causal mechanism or parameters without severing the causal links. This is generally more feasible in complex systems where we don’t have full control over key variables. It “has the advantage that causal structure that would be destroyed in structural interventions might be detectable, but it has the disadvantage that an association due to a (potentially unmeasured) common cause is not broken” (p. 8).
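A quick simulation makes that trade-off concrete (my toy model, assuming linear Gaussian mechanisms): under a hard intervention the confounder’s influence on X is severed, while a soft intervention shifts X’s mechanism but leaves the confounded path intact.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
U = rng.normal(size=n)             # unmeasured common cause of X and Y

# Observational mechanisms: X <- U, and Y <- X + U.
def simulate(intervention):
    if intervention == "hard":
        X = np.full(n, 1.0)        # do(X = 1): the U -> X edge is severed
    elif intervention == "soft":
        X = U + 1.0 + rng.normal(size=n)  # mechanism shifted, U -> X intact
    else:
        X = U + rng.normal(size=n)
    Y = X + U + rng.normal(size=n)
    return X, Y

for kind in ("none", "hard", "soft"):
    X, Y = simulate(kind)
    # correlation is undefined for a constant X; report 0 in that case
    r = 0.0 if np.std(X) == 0 else np.corrcoef(X, U)[0, 1]
    print(f"{kind:>4}: corr(X, U) = {r:+.2f}")
# hard: X no longer depends on U (confounding broken);
# soft: corr(X, U) stays ~0.71 (the common-cause association is not broken)
```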
When we spoke, Naftali Weinberger, a postdoc at the Munich Center for Mathematical Philosophy (see below), suggested that the usual understanding is that hard interventions fix variable values, while soft interventions change their distribution (conversation May 9, 2025). He thinks this characterization reflects not disagreement but a matter of emphasis.
(General note: The authors assume acyclicity, so feedbacks are out of scope (p. 3).)
Denver Dash, “Restructuring Dynamic Causal Systems in Equilibrium,” Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, R5:81-88, 2005.
Feedback loops involve equilibrium relations, and those mess with DAGs (directed acyclic graphs). They mess with them so badly that some people argue we shouldn’t use causal graphs with equilibrium data at all. A seminal treatment is Denver Dash’s 2003 University of Pittsburgh dissertation, Caveats for Causal Reasoning with Equilibrium Models. (Mad props for blowing up graph theory with his diss; but a diss is too long for me to read right now, so I’m summarizing a short paper offshoot.)
Dash finds that sequencing matters. Specifically, the order of equilibrating and manipulating a system can hugely influence its behavior. The two operations aren’t commutative: the equilibrated-then-manipulated and manipulated-then-equilibrated models aren’t identical.
This non-commutativity undermines DAG-based causal inference and forces modelers to choose between different equilibrium orderings, which can yield different predictions.
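A bathtub-style toy (in the spirit of Dash’s examples, with my own numbers) shows the failure mode. Equilibrating first gives a model in which outflow equals inflow and the water level is a function of outflow; applying do(outflow) to that model predicts a finite level. Manipulating first, i.e., clamping outflow in the dynamic model and then trying to equilibrate, yields no equilibrium at all: the level just grows.

```python
import math

INFLOW, DRAIN, C = 1.0, 0.5, 1.0   # inflow rate, drain size, outflow constant

# Dynamic model: the level integrates inflow minus outflow, and outflow
# depends on the current level and drain size: outflow = C * DRAIN * sqrt(level).
def simulate(steps, clamp_outflow=None, dt=0.01):
    level = 1.0
    for _ in range(steps):
        outflow = (clamp_outflow if clamp_outflow is not None
                   else C * DRAIN * math.sqrt(level))
        level += dt * (INFLOW - outflow)
    return level

# Equilibrate THEN manipulate: at equilibrium outflow = INFLOW and
# level = (outflow / (C * DRAIN))**2.  Applying do(outflow = 0.5) to this
# equilibrium model predicts a finite level:
print("equilibrium model predicts level:", (0.5 / (C * DRAIN)) ** 2)

# Manipulate THEN equilibrate: clamp outflow = 0.5 in the dynamics.
# Inflow exceeds outflow forever, so the level never equilibrates.
for steps in (10_000, 50_000, 100_000):
    print(f"dynamic level after {steps} steps:", simulate(steps, 0.5))
```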
This looks like a serious obstacle to causal discovery with DAGs, because feedbacks affect most if not all of the systems we care about. Homeostasis? Feedbacks. Security? Feedbacks. Markets, be they for money or for love? You get the idea. If it matters to humans, it seems like feedbacks are part of the process.
If equilibrating and manipulating were commutative, then we could still do causal discovery on observational data using DAGs. Because they’re not, we have to be careful to first manipulate and then equilibrate the model, and even then watch out for the temptation to treat the two orderings as interchangeable when they aren’t. We can guard against false predictions by carefully selecting which variables to include in the model; but that looks like a solution of limited practical use when it means omitting important things.
Anyway, Dash found the dynamical model privileged one of the two equilibrium models. Putting the “sequential” in the “modeling sequential data” problem that deep equilibrium models address. (This is intuitive if you’re thinking in terms of causal diagramming in the first place, because part of the power of DAGs is incorporating the temporal sequence of events in visual terms.)
This matters because real-world causal systems often unfold in time, with early changes shaping later dynamics. Snapshots can obscure these dynamics.
Enter Naftali Weinberger (h/t Julia Rohrer), rejoining Dash’s critique (“Intervening and Letting Go: On the Adequacy of Equilibrium Causal Models,” Erkenntnis (2023) 88:2467–2491).
In a recent podcast, Naftali mentions (minutes 32–33) that we used to model the brain as a computer, chunking tasks; but around the time computers got really good at chess, we noticed they were still really bad at walking. So we thought: maybe we have the wrong model, and need a dynamical system instead.
Where Dash problematized equilibrium causal models in light of the non-commutativity of manipulating and equilibrating, Weinberger groks the problem but remains optimistic about their practical use. They can still be adequate for causal inference, he argues, if we understand which feedback loops must remain untouched to preserve system stability. He stresses the importance of “letting go” of control over some variables to avoid disrupting homeostatic feedbacks (which he reconceptualizes as allostatic in a forthcoming paper).
The contribution here is recognizing that intervention design also means knowing when not to intervene: Understanding which variables not to intervene upon is as crucial as knowing which ones to target, as some interventions may destabilize systems by breaking these feedback mechanisms.
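To see why, consider a thermostat-like toy (my illustration of the “letting go” point, not Weinberger’s own model): clamping the controller, a hard intervention on the feedback variable, leaves the system at the mercy of disturbances, while changing the setpoint, a soft intervention that leaves the loop intact, still gets you where you want to go.

```python
import numpy as np

rng = np.random.default_rng(1)

def run(setpoint, clamp_control=None, steps=5000, gain=0.2):
    """Temperature drifts with noise; a feedback controller pushes it
    toward the setpoint unless the controller is clamped."""
    temps, T = [], 20.0
    for _ in range(steps):
        control = (clamp_control if clamp_control is not None
                   else gain * (setpoint - T))
        T += control + rng.normal(scale=0.5)   # disturbance at every step
        temps.append(T)
    return np.array(temps[1000:])              # discard the transient

baseline = run(setpoint=20.0)
hard = run(setpoint=20.0, clamp_control=0.0)   # break the feedback loop
soft = run(setpoint=25.0)                      # retarget; loop left intact

for name, t in (("baseline", baseline), ("hard", hard), ("soft", soft)):
    print(f"{name:>8}: mean {t.mean():6.1f}, std {t.std():6.1f}")
# hard intervention: temperature random-walks (std blows up);
# soft intervention: settles near the new setpoint with small variance
```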
But the point seems, to me, more philosophical than applied. Equilibrium effects still break DAGs. We still need a better way of modeling them. In many complex systems of practical importance, our choices of where and how to intervene are limited: strategic behavior and information effects typically leave some levers available to us and others not. In many cases, the intervention has already happened and we’re left to model the aftermath, with feedbacks already reinforcing or unraveling the system.
Recognizing what DAGs don’t do underscores the need for alternative modeling approaches that can accommodate feedback loops and dynamic interactions inherent in complex systems. This isn’t just a technical matter, but affects how we think about policy analysis. If we know we can’t quite do what we need to do to know what effects interventions will have, that affects the ethics and practical wisdom of doing them.
(Thanks to Naftali for sharing his knowledge and time!)