AI & Law Stuff #3
Academics, gym memberships, and, somehow, a frog
The academics are at it too
One thing that makes the use of AI in the legal profession and practice so alluring, but also so double-edged, is the backdrop against which that use takes place. We have, concomitantly:
A profession that frequently claims to be overworked and overburdened, on account of deadlines, clients to please, ethical and deontological duties to navigate, etc. Nobody will cry for them, but it is a common refrain, especially at the junior levels;
A billing structure (and professional obligations) that incentivizes thoroughness to the point of absurdity; and
A distinct respect for the written word and the text,1 a singular belief that a turn of phrase will make the difference.
Seen under that lens, generative AI can seem a godsend, in its ability to produce ever more text. And while I can’t stress enough in this blog that there is nothing wrong here as long as it is done responsibly (and I expect everyone will eventually reach a balance as to one’s use of these tools), this same impulse, this delegation of the writing duty, is at the root of many maladjustments of the legal profession with AI: the hypergraphia I mentioned in the first newsletter, the texts nobody reads in the second, and the hallucinations that are my (current) life project.
Well, who else (professes to be) overworked and overburdened, and has undue respect for the written word? Where else do we see hypergraphia, unread(able) prose, and hallucinations?
Seva Gunitsky at Hegemon reports from the front line of academia:
One thing that changed in that relatively brief time [as journal editor] is the sheer volume of manuscripts. The editor-in-chief emailed us last summer to warn that submissions were double or triple our typical averages. Many had little to do with the journal’s topic and instead focused on computer science or internet security. It seems people were using AI to generate terrible manuscripts and then shotgun-spraying them across the academy with little regard for quality or fit.
As a result, our desk reject rate rose to 75%. A desk reject is the first filter for academic journals, where the editor-in-chief determines which manuscripts should go out for peer review. Here, it served as an effective slop filter because the slop was easily recognizable. Our workload still increased, but only slightly.
A friend who is an editor at a prestige journal for international law recently shared the same experience with me, and you can easily find various headlines conveying the same feeling from, e.g., academic conference organisers.2
As for hallucinations, they are par for the course, with recent reports that papers submitted to prestigious AI conferences exhibit many mishaps in this respect:
After scanning 4841 papers accepted by the equally prestigious Conference on Neural Information Processing Systems (NeurIPS), we discovered 100s of hallucinated citations missed by the 3+ reviewers who evaluated each paper.
Crucially, the issue here goes deeper than the (laughable) examples once compiled by academ-ai,3 of papers that forgot to delete the enthusiastic “Certainly!” from ChatGPT. Instead, what we see are papers that, sometimes at least, would in another time have simply been the average output of an unimaginative academic who has to meet their publish-or-perish goals.
This is why Gunitsky is also right to insist that this is not necessarily slop in the sense we attribute to brainless, feed-ready content, and that this puts the spotlight on the notions of judgment and discernment. That there are now dozens of papers of middling quality making tiny points barely worth considering just brings home the point that, maybe, publishing this kind of paper has never been worth it.
In other words, the academic world (at least when it comes to non-STEM fields) is faced with the same question put to lawyers and jurists: for a profession centred around text and its production, what happens when text comes cheap?
And just as for lawyers, the undeniable personal benefits of AI come with systemic consequences, of varying valence. Take this remarkable recent paper published in Nature:
Using a dataset of 41.3 million research papers across the natural sciences and covering distinct eras of AI, here we show an accelerated adoption of AI tools among scientists and consistent professional advantages associated with AI usage, but a collective narrowing of scientific focus. Scientists who engage in AI-augmented research publish 3.02 times more papers, receive 4.84 times more citations and become research project leaders 1.37 years earlier than those who do not. By contrast, AI adoption shrinks the collective volume of scientific topics studied by 4.63% and decreases scientists’ engagement with one another by 22%. By consequence, adoption of AI in science presents what seems to be a paradox: an expansion of individual scientists’ impact but a contraction in collective science’s reach, as AI-augmented work moves collectively towards areas richest in data. With reduced follow-on engagement, AI tools seem to automate established fields rather than explore new ones, highlighting a tension between personal advancement and collective scientific progress.
On this model, our possible future: better (AI-enhanced) lawyers, worse (or at least less imaginative) case law.
Flooding the Zone With Briefs
The systemic issues may, of course, range further than a lack of innovation in the output lawyers derive from generative AI.
I have long taught my students about what I call the “gym membership” model of the law: simply put, a large part of the legal system works because people don’t use it, very much like gyms are thought to operate on the wishful thinking of people signing up in January and never coming back. I claim no originality, and am influenced here by the classic POSIWID (“the purpose of a system is what it does”) and a long-standing interest in the sociology of organisations, but if you squint, you can see that a lot of things can be described in line with this system: the welfare state, fractional reserve banking, and, of course, gym memberships.
And so, the question of “what happens when text becomes cheap” takes on a distinct flavour for courts and tribunals faced with mountains of text and submissions which they are meant to delve into. It is particularly acute for those adjudicators dealing with self-represented litigants, or one-off actors, not bound by ethical rules (or the pragmatic demands of a repeat-player game).
I had this in the back of my mind when reading David Timm’s report on “Gen-AI Misuse in Procurement Litigation”. In particular, the idea that:
procurement tribunals are already under pressure to resolve disputes quickly and have limited resources to do so. Brandolini’s Law says that energy spent to refute false claims is an order of magnitude higher than to create the falsehoods. This concept applies with equal force to frivolous bid protests and wasteful monetary appeals. A rapid increase in new filings may overwhelm the system with many flawed filings. While these are being resolved, many procurements are paused pending resolution. In every case, the Government, private parties, and tribunals will waste time and resources dealing with these filings. The long-term consequences are still emerging.
While the report focuses on the misuse of AI, the remarks here can just as well apply to the mere use of AI where no use existed prior; in other words, a system that already strains under its normal caseload might break now that the costs of participation have dropped tremendously.
It’s still too early to know whether this will be the case; one possible answer - deployed here and here - will be to raise these participation costs by introducing friction. The requirement to retain a lawyer common to many civil law jurisdictions, for instance, can be understood as such a friction, as are the many procedural rules that, when breached, result in a case being thrown out: expect those to gain in standing in the coming months and years.
Contextual leaks and amphibian hallucinations
If one were to retrace the (short) history of AI hype since the release of ChatGPT in November 2022, a few different phases could easily be identified, with the commentariat (in the form, e.g., of these awful LinkedIn or Twitter posts with rocket emojis) focusing, in turn, on the following points:
Prompt engineering;
Fine-tuning;
Retrieval-augmented-generation (“RAG”);
Model Context Protocol (“MCP”); and now
Agents.
As someone straddling the tech and legal worlds, it has been interesting to see how these concepts migrated from the former to the latter, often with quite a bit of lag (the lawyers I interact with have barely reached the “RAG” stage).
But if one takes a bird’s-eye view, most of these subjects of hype and discussion come down to the same basic intuition: one gets the best out of a model when one masters the context of a given input. A few weeks ago someone described this adroitly as “context plumbing”, and it is constantly in the back of my mind as I am designing a legal tech product: how do I make sure that the right context reaches the model so as to steer it towards an optimal output?
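To make that intuition concrete, here is a minimal sketch of what context plumbing can look like. Everything in it is hypothetical: the Passage structure, the relevance scores, and the prompt layout are illustrative assumptions, not a description of any actual product (mine included).

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str   # provenance: statute, brief, case law, ...
    text: str
    score: float  # relevance score from some retriever, in [0, 1]

def build_context(question: str, passages: list[Passage],
                  min_score: float = 0.5, max_chars: int = 4000) -> str:
    """Assemble the context that will actually reach the model.

    The plumbing is in the details: drop weakly relevant passages,
    keep provenance visible, and cap total length so the best
    material is not crowded out by the rest.
    """
    kept, used = [], 0
    for p in sorted(passages, key=lambda p: p.score, reverse=True):
        if p.score < min_score or used + len(p.text) > max_chars:
            continue
        kept.append(f"[{p.source}] {p.text}")
        used += len(p.text)
    return (
        "Answer using ONLY the sources below, citing the bracketed "
        "source for every claim.\n\n" + "\n\n".join(kept)
        + f"\n\nQuestion: {question}"
    )

# Made-up usage: one relevant passage, one piece of noise.
passages = [
    Passage("Art. 1240 C. civ.", "Any act of man that causes damage to another "
            "obliges him by whose fault it occurred to repair it.", 0.91),
    Passage("Unrelated email thread", "Lunch on Thursday?", 0.12),
]
print(build_context("Who must repair the damage?", passages))
```

The point is not the ten lines of code but the design choice they encode: what never reaches the model cannot steer it astray.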
These musings serve as a preface to a recent example of context plumbing gone wrong, but I’ll let the local news bulletin describe it for me:
HEBER CITY, Utah — An artificial intelligence that writes police reports had some explaining to do earlier this month after it claimed a Heber City officer had shape-shifted into a frog.
However, the truth behind that so-called magical transformation is simple.
“The body cam software and the AI report writing software picked up on the movie that was playing in the background, which happened to be ‘The Princess and the Frog,’” Sgt. Keel told FOX 13 News. “That’s when we learned the importance of correcting these AI-generated reports.”
While this also serves as a reminder that when technology, including AI, fails, it sometimes does so in the dumbest way possible, it brings home the point about context plumbing: the data is rarely clean and well-structured, and your model does not care either way. But you do. Or at least, you should.
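The conceptual fix here is simple, even where the engineering is not: tag every piece of context with its provenance, and route anything unattributed to a human rather than to the model. A toy sketch, with entirely made-up fields:

```python
# Hypothetical transcript segments from a report-drafting pipeline;
# the "speaker" field is an illustrative assumption.
segments = [
    {"speaker": "officer_bodycam", "text": "Subject detained without incident."},
    {"speaker": None, "text": "I turned into a frog!"},  # background movie audio
]

def split_by_provenance(segments):
    """Send attributed speech to the model; flag the rest for review."""
    trusted = [s["text"] for s in segments if s["speaker"]]
    flagged = [s["text"] for s in segments if not s["speaker"]]
    return trusted, flagged

trusted, flagged = split_by_provenance(segments)
print("To the model:", trusted)
print("For human review:", flagged)  # you care, even if the model does not
```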
On this, I would recommend last week’s post from Aurélien at Trying to Understand the World.
2. These testimonials of course echo the (already two years old!) classic headline about this SF magazine ceasing to accept submissions, as it was submerged with AI-generated submissions.
3. A project that seems to have stalled and would deserve to be picked up.

