AI & Law Stuff

#19 AI Snitches, strategic hallucinations, and op-ed BTS

May 22, 2026

Your Agent Takes the Stand

The release of OpenClaw in January spurred a new bout of interest in the notion of a personal agent that can be embedded in your own system and your own routines. Although this has lost some steam recently, and it will probably take a lot of ups and downs for such agents to go mainstream, it’s safe to bet that the future will include an ever-greater number of people trusting personal AI systems with most details of their lives, in one way or another.

What will these agents do? Well, to their proponents, they will tend to become a second you, able to act, speak, answer, and think in your stead, to take care of things you would prefer not to bother with. Some will also like their agents to be somewhat public-facing, to work as their own personal spokesperson, up to a point - ideal, for some - where a lot of “interpersonal” connections will take place between agents, with no need to keep any human in the loop.

Now, an agent that can know everything about you and can answer questions about you, your character, and your actions - surely that won’t ever be an issue, right?

We discussed last week the wisdom of not taking notes of everything, every time. We also predicted that LLM logs will serve as a wonderful archive of mens rea for prosecutors. Meanwhile, people are trusting AI chatbots with ever greater control over their behaviour, to the extent that “the AI made me do it” might soon become a (rather inefficient, I hope) exculpatory defence.

And so, you read it here first: one day some smart prosecutor is going to put a personal agent on the stand.[1]

But the real point here is that it will be particularly easy to make the agent talk. The Guardian had a good piece recently on people who specialise in jailbreaking models, pointing out that “[t]o test the safety and security of AI, hackers have to trick large language models into breaking their own rules. It requires ingenuity and manipulation”. The playbook, upon reading, looks very much like a cross-examination: models, trained on human language, often react like humans (or at least pretend to) when put in human-like situations.

That’s one of their strengths, but also a weakness. Radley Balko at The Watch reports on the experiments of a researcher who elicited a false confession from ChatGPT, using:

the Reid technique, the confrontational interrogation method first developed in the 1950s that has since been adopted by police departments all over the country. The man for whom it’s named, John Reid, published his methodology after winning acclaim for getting a man named Darrel Parker to confess to raping and murdering his own wife. The Reid technique works at getting confessions. But it’s less successful at getting accurate ones.

Expect agents to snitch on you - even, maybe, when you did nothing wrong.

Strategic Hallucinations

When asking me about the hallucination database, people frequently want to know how many cases slipped under the radar. And while I try to be exhaustive, beyond the many referrals I receive (with a great many thanks!), I am bounded by the availability of data and my own proactiveness in scouring for new cases.[2] Some cases also can’t be counted due to cultural/legal differences in how they are being handled.[3] And finally, there are a lot of hallucinations that no one ever notices, because they bear on unimportant or moot legal arguments, or because no one took the care to check.

But more recently I realised there is another possible category of missing hallucinations: those that are noticed, but left unmentioned, because the party noticing them may want to keep this argument up their sleeve at some point.

I don’t think this is common. Calling out hallucinations on the other party’s part is a one-shot gun, and you would probably rather be the one to flag it than have the court spot it or your opponent come clean first. But I still think that’s a non-zero category, and I am looking forward to an appeal stage focusing on that one alleged flaw from the first instance that fortuitously surfaced only when it was too late.

But to push the question further, how else can hallucinations be strategically deployed?

In their basic form - i.e., fake or misrepresented authorities adduced to support a legal argument[4] - it’s unlikely the cost-benefit of deliberately peppering a brief with hallucinations will ever work out. Sure, you save time and effort, and obtain a text that on the face of it looks extremely well-supported - something that, given most people’s reliance on superficial heuristics, might achieve results until you get caught. Or you might want to flood the other party with strong-looking submissions to force through a settlement, hopefully before anyone checks that there is any substance to your legal position. But once you are caught, any advantage evaporates, and you are left worse off with your credibility in tatters.

(I am not talking here about the ethics of generating legal submissions riddled with hallucinations; of course, don’t do that.)

Alternatively, I see a few ways to leverage hallucinations strategically (though not always realistically):

You can try to make the other side hallucinate. If you know the other side is using AI, you can shape the inputs they feed it (large, confusingly-drafted filings, made-up-sounding but real citations, near-miss case names) so their model is more likely to confabulate when it summarizes or responds. Going further, we should eventually expect some prompt injections in legal briefs designed to trip up an opponent’s use of AI or contribute to their failure to properly prosecute a case.
You can allege that real but obscure authorities from the other side were hallucinated to cast doubt on their competence, especially where the citation is from a jurisdiction, database, tribunal, or grey literature the court does not immediately recognise. This is using the ambient fear of hallucinations as a litigation weapon. (I have seen some examples of this already, but it works only if you have some credibility to begin with.)
You may opt for some hallucinations in non-court-facing documents, to instill a false sense of security/superiority in your opponent. “Look at this chump who forgot to remove the prompt, we’ll eat them for breakfast in court” is the kind of reaction you might want to elicit from your opponent.
Finally, parties may eventually fall into a tu quoque / mutually-assured-destruction equilibrium: once both sides are heavy AI users, and once both quietly suspect their own filings wouldn’t survive a forensic audit, calling out the other party’s hallucination becomes an invitation to have your own briefs combed through in retaliation. The result is a tacit non-aggression pact, where no one has an incentive to throw the first stone.

I am not recommending any of this. I am simply noting that the incentives exist, the technology is there, and the people are, professionally speaking, paid to be imaginative. Draw your own conclusions.

More, not fewer lawyers

Last Sunday the Washington Post published an op-ed of mine arguing that the number of lawyers will keep inching up over the next few years, far from the predictions of collapse made here and there.

This being an op-ed, you can trust most of the original draft was eventually edited out for space - which is perfectly understandable, but means I can also expand here on Substack, the columns of which are much more expandable.[5] In particular, it could be useful to lay out some of the “meta” arguments behind this opinion piece, at least to record my thinking process in this respect.

To start with, this piece made a prediction: that the number of lawyers will broadly continue to increase. Now, I don’t claim to be good at predicting the future, because I know most people are terrible at it, which means, presumably, me too. Philip Tetlock’s research on superforecasters impressed me at a young age. But one thing I got from it was that a good strategy when peering into the future is to start by expecting things to … continue mostly as they have. In other words, taking the base rate as a steady baseline often works better than sophisticated models as a starting point.

(Certainly, stark discontinuities exist too, black-swan events, etc.! But they are not the majority of cases and one may prefer to bet on the more common scenario of things being boring.)

Another teaching from Tetlock is that, although experts are terrible at forecasting, their failure in this respect makes next to zero difference to their status. Cue the few people who “predicted” a given financial or economic crisis, and since then keep getting invited to opine on their views of the future - even if these views never come to pass. This is rather dispiriting and liable to turn anyone into a cynic, but at least, you know, this also means predicting is mostly upside, little downside. (Not that I do not care about being wrong - I care tremendously - but that helps relativise it.)

Likewise, there is only upside here because the argument is made ceteris paribus: if World War III, Covid-27, or the Rapture happens, decreasing bar applications will be the least of our worries. And ditto, in fact, if AI kills or disempowers us all by then.

And so, my point is that I did not opt to stake out a position that’s particularly far-fetched, even though I have rarely seen it spelled out before. By contrast, the opposing view - that lawyers, on their own, or especially so, or together with other white-collar professions, will be wiped out - is (or at least was; the past few weeks saw some shift in this regard) rather salient in the commentariat. I am naturally contrarian, but it becomes even easier when the contrary position is the one that makes the most sense to me and is, at least on second thought, the easiest to adopt.

All this is not to say that I am not convinced on the merits by the arguments in the op-ed; I very much am. And there are more of them, left out courtesy of the editing process. I am thinking, e.g., about the fact that lawyers and judicial authorities will likely, albeit regrettably, add frictions that will benefit (human) lawyers to the detriment of AI (lawyers). I’ll also eventually write a piece on the “lump of law fallacy”, the idea that if we automate some legal efforts, this will free time or availability - whereas, in all likelihood, this will just serve as a new baseline for ever more norms.

But at least I have now put my expectations in writing, under my name and with a date on it. So let’s see, a few years out, whether it ages well. (Assuming, of course, the Rapture hasn't sorted out the question for us first.)

1. To some extent this is very artificial: “memories” for an agent will likely be stored in a way that can be retrieved by anyone, and it’s these memories that would contain a confession. But, you know, anything goes to sway a jury.

2. For what it’s worth, I recently found a research article that did its own tally of the cases for roughly the first half of 2025, and out of around 90 cases the database was missing four or five.

3. For instance, this recent French newspaper article suggests many French cases are handled at the bar level, in the lawyer-to-lawyer grievance process - and not by having a decision on the record in a particular matter.

4. Which takes us closer to pure forgery or fabrication than “hallucination”, but let’s not get caught up in that.

5. Though I try not to abuse my readers’ patience, and so keep posts around the 2k-word limit.

Artificial Authority
