AI & Law Stuff #1
Unread texts, unheard errors, bad actors
(This is the first of a – hopefully – weekly newsletter analysing recent developments in the field.)
Hypergraphia Meets Autocomplete
A lot of professions, mine included, revolve around what Bourdieu, long ago, described as “manipulating symbols”, i.e., producing, arranging, and legitimising meaning through language, categories, and signs rather than through direct material transformation. Sociologist Musa al-Gharbi has recently coined the term “symbolic capitalists” to describe this category of workers, and persuasively argued that, if you are reading this, you are probably one of them.
Now, a key way to manipulate symbols is by producing a text, often relying on other texts, to catalyse actions that will, in all likelihood, be embodied in further texts. Many industries, seen from afar, are just manifestations of institutional hypergraphia: writing text after text that, to a large extent, no one will ever read. Years ago, a study found that a staggering third of the reports the World Bank put online were never even downloaded. More recently, the UN admitted to doubts about whether its – many – reports were ever read or served any purpose.
There used to be one exception: at least, and by definition, the person writing a text had read it (or at least the part they wrote). But of course, this is no longer a given.
All this to introduce the fact that, while this newsletter focuses on legal AI and data, many other types of textual output suffer from the same issue I have recorded in the database: a lack of verification resulting in hallucinations.
And indeed, we saw it a few weeks ago with headlines reporting that Deloitte, a consultancy, had issued reports full of hallucinations. In the same vein, there was the incident in which a Norwegian municipality adopted a local order full of hallucinated references, something that came to light when local journalists obtained the record of the conversations between the municipal officer and ChatGPT (as recounted here). And of course, there is the continuing crisis of (higher) education, where anything that can be written digitally is written by ChatGPT.
I expect such headlines will keep coming for many other types of text output, such as:
Audit reports;
Policy briefs and white papers;
ESG disclosures;
Grant applications;
Tender documents;
DEI action plans;
Strategic roadmaps;
Academic literature;
Regulatory filings;
Medical summaries;
Etc.
Note that I am not sounding the alarm here. As in the legal domain, the benefits of AI likely outweigh the occasional hallucination; and one may wonder whether a hallucinated citation falling in a report no one reads makes any sound at all.
Duty of Care, Duty to Read
It’s hard to predict anything, especially the future, but one thing I can say with certainty is that 2026 might be the year when LLM providers truly feel the heavy hand of the law when it comes to liability for their outputs (beyond the IP issues), be it through regulatory action (hey Grok!) or through lawsuits finally concluding, some years after LLMs entered the global scene.
I am not keeping track of all these legal actions (this Substack does, somewhat), and I generally have a certain sympathy for LLM providers and for the argument that users should be responsible for what they do with models; but of course, a lot of these cases will turn on the facts and the applicable laws.
And in this sense, here is one that recently concluded in China, as reported by King & Wood Mallesons:
In December 2025, the Hangzhou Internet Court held that the defendant, a generative AI service provider (the "Defendant"), was not liable for generating AI "hallucinations," finding that the Defendant had fulfilled its reasonable duty of care, such as applying the common technological measures widely used in the AI industry to enhance the accuracy of its AI-generated content and also reminding its users that the AI-generated content might not be accurate.
This is an interesting case not only because the AI system produced hallucinations that could have misled the plaintiff (a scenario likely different from hallucinations sounding in libel or slander, as has happened in other jurisdictions), but also because the AI wrongly stated that the plaintiff could receive compensation for such lapses.
The court reportedly declined to enforce that latter promise, a contrast with the Canadian case in which an airline chatbot’s erroneous advice was given legal force. (Though one key distinction is whether the AI system acts as a company’s mouthpiece.)
Also of interest is the court’s approach to duty of care, disclosure, and the capacity to follow “industry standards” – on all of which it found in favour of the LLM provider. This strikes me as a sensible approach, although, like any approach based on standards and notions of proportionality, it leaves a lot of leeway for judges, who are rarely tech-minded, to find that a provider has not complied with its duties.
Broken Windows, Broken Citations
In academic circles, the issue of generative AI has, from the outset, often been conflated with the question of plagiarism. This made a lot of sense: in both cases we are dealing with the examination of texts, and universities have spent years developing policies (and deploying tools) designed to catch plagiarism – surely they could draw on these to deal with AI.
I have often thought that this was a mistake, if only because, for a long time (and maybe still now), the tools professing to detect AI writing were terribly miscalibrated, producing many false positives that tech-averse institutions would act upon to the detriment of students. I have seen many students wrongly accused of using AI to generate content, something all the more infuriating since I fail to see the issue with using AI to draft, as long as it is done responsibly.
But conflating plagiarism and AI is not entirely misguided, when one considers the type of actors who resort to the former, or who fail to use the latter responsibly.
Courtesy of The Economist, here is a superb study by John Liu, Wenwei Peng and Shaoda Wang, in which they find:
Applying advanced plagiarism-detection algorithms to half a million publicly available graduate dissertations in China, we uncover hidden misconduct and validate it against incentivized measures of honesty. Linking plagiarism records to rich administrative data, we document four main findings. First, plagiarism is pervasive and predicts adverse political selection: dishonest individuals are more likely to enter and advance in the public sector. Second, dishonest individuals perform worse when holding power: focusing on the judiciary and exploiting quasi-random case assignments, we find that judges with plagiarism histories issue more preferential rulings and attract a greater number of appeals — effects partly mitigated by trial livestreaming.
When describing the hallucinations database, I often note the substantial minority of “bad actors”: people who filed briefs with hallucinated materials not because they made a mistake or were unaware of an LLM’s propensity to hallucinate, but because they were reckless and did not care – vexatious litigants or sloppy lawyers. And one benefit of spotting hallucinations (as with spotting plagiarism, if we tried to do it systematically) is that it puts the spotlight on these bad actors.
This matters all the more in light of the study’s further finding that:
Third, dishonesty spills over across judges and between judges and lawyers.
Call it the broken windows theory of hallucinations: a disregard for the truthfulness of a text leads to more issues down the line, including a disregard for reading texts at all – a potential catastrophe for a field, the law, whose legitimacy rests in part on textual chains of authority.
Moreover, all this takes on a new light given the authors’ finding that “among colleagues with identical seniority, individuals who plagiarized their dissertations advanced 9% more rapidly in the first five years of their careers”. Even if this result is limited to the public sector, it raises a more general concern: that selection for dishonesty may be far more prevalent than we tend to assume, and that AI may amplify rather than merely reveal it.
Tracking these questions – arising from the use of AI and the forms of authority it may erode – is exactly what we will be doing here every week.

