Encryption Survives. Anonymity Doesn’t.
Chat Control heads to its final trilogue June 29. The mandatory-scanning version my paper models is out of the text — for now. Here's why it still matters.
I just hit submit on my security working paper after a bit more coding and estimating and rewriting. The current version lives on GitHub. Naturally, the week I let it go is the week Brussels schedules its Chat Control endgame.
Here’s where things stand.
The State of Play
Chat Control 1.0, a voluntary regime that let providers hash-match against known child sexual abuse material (CSAM) on services whose content they can read (that is, not end-to-end encrypted), is dead. Parliament let the temporary derogation lapse on April 3 (I wrote about that here). Several of the big providers announced they’d keep scanning voluntarily anyway, which tells you something about how much work the “legal basis” was ever doing. People really want to protect kids, and not letting abuse material live on your servers might seem like a no-brainer to that end.
Chat Control 2.0 — a permanent framework — is very much alive, and it’s heading to its fifth and final trilogue on June 29, with adoption expected in July. Where 1.0 was voluntary, known-hash, and unencrypted-only, 2.0’s maximal version flips every axis: mandatory, unknown content (AI-classified, not hash-matched), and everyone — reaching even into encrypted services. So if you’ve been waiting for the part where this actually gets decided: it’s now.
The headline is that the current Council text walks two of those axes back: it dropped the explicitly mandatory detection orders, and the explicit scanning of end-to-end-encrypted messages. Civil liberties win, right?
Sort of.
Two Quieter Ways to Lose Your Privacy
The text traded one loud mechanism for two quiet ones.
First, risk-mitigation obligations. Providers would have to assess how their services could be misused and mitigate the risks. Critics argue that, at scale, the only way to mitigate is to scan. There is just too much information flow to not automate it. Hash-matching known material (PhotoDNA) is the easy, defensible part — but “mitigate the risk” doesn’t stop at known hashes; in practice, once the infrastructure is in place, the worry is that it pushes toward classifying unknown CSAM and other illegal content, which is exactly the version my paper says backfires. So mandatory scanning walks out the front door and climbs back in through the compliance window.
Second, mandatory age verification as a condition of using private (and thus perhaps encrypted) messaging. That may mean that before you send an encrypted message, you first prove who you are. The encryption survives. The anonymity doesn’t. For a whistleblower, a journalist’s source, a survivor, a teenager, that distinction is the whole game. (But: anonymity shields offenders, too!)
So the version of Chat Control that my paper actually models — mandatory AI classification of unknown CSAM across everyone’s messages — isn’t in today’s text. But “isn’t in today’s text” doesn’t mean “off the table.” Mandatory language can be reintroduced in revision. Member states keep implementation discretion. And the math doesn’t care which presidency holds the pen.
Why I Submitted Anyway
Because I really want to get back to doing science.
But also, the argument is structural, not topical. Can we pull off scanning for known CSAM to advance child safety? Yes. We just need more investigative resources, and public awareness campaigns would probably help.
Can we pull off scanning for unknown CSAM to advance child safety? No. Four things doom such a program to backfire massively: too many false positives in spite of high accuracy, dedicated attackers going dark, casual offenders being too stupid to be deterred, and resource exhaustion. The result undermines the very child safety it’s meant to protect.
The four pathways (classification, strategy, information, and resource reallocation) converge on net harm for the mandatory unknown-content scanning version, regardless of how this month’s wording lands. So the paper is a tool for whenever that version resurfaces. And it will. It’s been a zombie proposal for years (killed, only to come back the next cycle); the Council wants it back; it will be back.
Quick recap: hash-matching against known CSAM is potentially resource-tractable, because you’re checking against a confirmed list and a flag can be verified rather than merely inferred. Mass classification of unknown content is not. This is because, under rarity, the sheer volume of false positives swamps the true positives, no matter how good the classifier is. Separating false positives from true ones afterward is too hard, too resource-intensive, and too imperfect.
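To make the rarity point concrete, here is a minimal Python sketch of the base-rate arithmetic. Every number in it (message volume, prevalence, sensitivity, false positive rate) is an illustrative assumption chosen for round figures, not an estimate from the paper or from any real deployment, and the function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule: P(content is actually illegal | classifier flagged it)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Illustrative assumptions only, not figures from the paper.
messages = 1_000_000_000        # messages scanned
prevalence = 1e-6               # share of messages that truly contain CSAM
sensitivity = 0.99              # share of true cases the classifier catches
false_positive_rate = 0.001     # share of innocent messages flagged anyway

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
true_flags = messages * prevalence * sensitivity
false_flags = messages * (1.0 - prevalence) * false_positive_rate

print(f"Share of flags that are real: {ppv:.4%}")
print(f"True flags:  {true_flags:,.0f}")
print(f"False flags: {false_flags:,.0f}")
```

Under these assumed inputs, a classifier that is right 99.9% of the time on innocent messages still produces flags of which only about one in a thousand is real. Raising the prevalence argument to 0.5 pushes the same function above 99%, which is why the problem is the rarity, not the classifier.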
This is counterintuitive to techno-solutionists! But it is just Bayes. Well, that, plus a little causal-revolution sauce to make sure no other pathway compensates, on net, for the stupendous Bayesian backfire.
In my analysis, dedicated offenders dominate harm-weighted outcomes ~46 to 1, and the false positive flood overwhelms investigative capacity by more than 2x before a single specialist sees a case. (Full argument with proof and simulation in the repo.)
Still No Constituency
Privacy activists don’t love that I argue known-hash scanning is statistically and legally defensible. Child-safety advocates don’t love that I argue mandatory scanning for unknown CSAM would undermine child safety. “Known: yes, unknown: no” has no fan club. I continue to find this more funny than frustrating.
What I’m Asking For
Insiders: if you have data on how any of this has actually operated — volumes, false positive rates, workflow — I’d love to see it.
Critics: if my assumptions are off, tell me how. I’d rather be fixed now than wrong in print.
Collaborators: the framework applies equally to financial, medical, and educational screenings, among others. If you work in the field and see the connection, let’s talk.
Paper: GitHub.


