Humans in the loop won't fix AI hallucinations. Here's why.

AI hallucinations, the tendency of large language models to generate confident, plausible-sounding falsehoods, are not a bug that will eventually be patched. They are a structural feature of how these tools work. Judging by the almost weekly news stories of lawyers or consultants being caught out by a false case citation submitted to court or a fictional footnote in a client report, most people haven't found a robust way to govern this or to prevent hallucinations finding their way into end products. Understanding why requires abandoning the idea that better training or a “human in the loop” will one day solve the problem entirely.

Before I go on, I note that some AI commentators hate the word “hallucinations”. They say it anthropomorphizes the model or implies some conscious intent, or that it is used politically to fuel the mythology of the model and make errors sound poetic, rather than simply wrong. I'm going to use the word “hallucinations” here anyway, since it is the commonly understood terminology, but I acknowledge the objections to the word and hope we one day find a better term.

Language models do not retrieve facts from a reliable store of knowledge. When they generate text, they do so by predicting what words are likely to follow other words, based on patterns learned from vast amounts of human writing. This means they can produce fluent, authoritative prose about things which simply are not true, and they have no built-in mechanism for telling the difference. They are optimized to be plausible, not factual. A model which fabricates a court case has no reliable sense of its own uncertainty. Calibrating that uncertainty better is possible, and research is ongoing, but eliminating hallucinations altogether would require a fundamentally different kind of system. For now, and for the foreseeable future, hallucination is the price of the fluency that makes these tools useful.
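For readers who like to see the mechanism, here is a deliberately tiny sketch of "predict the next word". It is a toy, not how any real model is built: the table `TOY_MODEL`, its probabilities and the case name it produces are all invented for illustration. The point is only that nothing in the loop checks whether the output is true.

```python
# Toy illustration only. The words and probabilities are invented, and
# "Smith v Jones" is a made-up placeholder, not a reference to a real case.
TOY_MODEL = {
    "The": {"leading": 0.7, "only": 0.3},
    "leading": {"authority": 0.8, "case": 0.2},
    "authority": {"is": 0.9, "was": 0.1},
    "is": {"Smith": 0.6, "unclear": 0.4},
    "Smith": {"v": 1.0},
    "v": {"Jones": 1.0},
}


def generate(start, model, max_words=10):
    """Extend the text by always choosing the most probable next word."""
    words = [start]
    while len(words) < max_words and words[-1] in model:
        options = model[words[-1]]
        # Pick the most plausible continuation. There is no step that asks
        # whether the resulting sentence is factually correct.
        words.append(max(options, key=options.get))
    return " ".join(words)


print(generate("The", TOY_MODEL))
```

In this toy, the confident-sounding citation wins simply because it is the more probable continuation; the honest alternative ("unclear") loses on plausibility alone.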

The Myth of the Human in the Loop

The standard response to this problem is to keep a “human in the loop”. If AI hallucinates, the logic goes, a qualified person should review every output before it is acted upon. This is sensible in principle, but often inadequate in practice.

“Human in the loop” specifically means a person who reviews, approves or corrects AI output at each decision point before anything consequential happens. This is different to other forms of human oversight, such as auditing outputs after the fact, setting policy guardrails in advance or monitoring aggregate performance over time. Post-hoc audit can catch systemic problems, but in some contexts cannot undo specific harms. Policy-level oversight can reduce risk, but does not verify individual outputs. Human in the loop review is the only form of control that directly addresses the accuracy of an AI-generated claim before it does damage. It's also the most expensive, slowest and least scalable type of control.

In many real-world deployments, human in the loop isn't feasible. The volume of output is too high, the cost of qualified reviewers is too great, or the speed required by the use case makes synchronous human review impossible. Legal AI, for example, may process thousands of pages in the time it would take a lawyer to read one. The business case often relies on the technology's speed, which means the human review that would make it safe may be the very thing that is traded away.

Confirmation Bias

Even where human review does occur, there is a deeper problem. Reviewers are not neutral. People tend to accept information that confirms what they already believe or expect, and to scrutinize information that contradicts it. When a lawyer asks an AI system to research supporting case law and the system returns a list of plausible citations, taking the time to check each case and quotation runs against the cognitive grain of a lawyer who may be overworked and under pressure. This is confirmation bias at work, a feature of human cognition which operates below the level of conscious awareness. Where an error is consistent with what the reviewer expected to find, the review is less likely to catch it.

Professional Negligence

Professionals are held to a standard of “reasonable care” for their particular profession under professional negligence law, and the question for the courts will be what “reasonable” means when AI is routinely used. A number of lawyers have already been sanctioned by courts for submitting briefs which contained hallucinations. The courts said this was a breach of their professional conduct rules and potentially professional negligence. The lawyer, not the tool, is responsible for the accuracy of the filing. This is the right legal position, in my view, but it creates a difficult practical scenario. Where the market forces lawyers to offer clients lower fees on the back of AI adoption, the volume of work per lawyer goes up, and reviewing the output becomes operationally and cognitively challenging.

Critically, we do not yet know how AI-assisted error rates compare to the baseline. How often did lawyers, before AI, cite cases they had not fully read, mischaracterize a case or miss relevant precedent? How often did doctors misdiagnose or misread test results? With baselines poorly documented and industries protecting the perception of their profession as flawless, the real impact of AI on professional output isn't properly understood. Is there a chance that we have actually become more accurate with AI?

What about the person who isn't a lawyer, who can now represent themselves in court but finds that their AI “lawyer” has misquoted a case? The litigant-in-person can't be held to be professionally negligent, but will the courts find that the LLM has acted as an unlicensed lawyer (whatever the provider's terms might say about permissible use cases) and find a way to make the provider accountable, through negligence, contract or some other source of liability (e.g. product liability law), for the wasted court time and additional costs?

Living with Errors

None of this means AI should not be used in professional contexts. The path forward requires honesty about the limits of human oversight. Consider:

using a different AI to check AI outputs, “AI in the loop”;
using authoritative sources as your data source for fact-checking, e.g. checking an LLM draft against a LexisNexis database;
training your teams about cognitive biases and running tests to see how biased each of them is;
teaching lawyers the new skillset of highly critical fact-checking;
choosing where your output is the highest risk and focusing your human capital on reviewing this content in more detail;
designing who and how many people take part in the review before content can be released; and
understanding the impact of effective review on cost models and retaining the experts you need to perform the review. Don't promise radically reduced fees without understanding the cost of your human and other oversight. Leave time in the model for meaningful review to take place.
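As a rough illustration of two of the ideas above (checking a draft against an authoritative source, and focusing human attention where the risk is highest), here is a minimal sketch. Everything in it is an assumption made for the example: `TRUSTED_CITATIONS` stands in for a query against a licensed legal database, the citation format is a simplified neutral-citation pattern, and the "[2099] DEMO" citations are obvious placeholders rather than real cases.

```python
import re

# Hypothetical stand-in for an authoritative source. In practice this would
# be a lookup against a licensed legal research database, not a local set.
TRUSTED_CITATIONS = {"[2099] DEMO 1", "[2099] DEMO 2"}

# Simplified, assumed pattern for neutral citations such as "[2099] DEMO 1"
# or "[2099] DEMO Civ 12". Real citation formats vary far more than this.
CITATION_PATTERN = re.compile(r"\[\d{4}\]\s+[A-Z]+(?:\s+[A-Za-z]+)?\s+\d+")


def triage_citations(draft, trusted):
    """Split the citations found in a draft into verified and unverified."""
    cited = CITATION_PATTERN.findall(draft)
    return {
        "verified": [c for c in cited if c in trusted],
        # Anything not found in the trusted source goes to a human first.
        "needs_human_review": [c for c in cited if c not in trusted],
    }


draft = "The claimant relies on [2099] DEMO 1 and [2099] DEMO 7."
print(triage_citations(draft, TRUSTED_CITATIONS))
```

A check like this only confirms that a citation exists in the trusted source. It says nothing about whether the case supports the proposition it is cited for, or whether a quotation is accurate, so it narrows the human reviewer's workload rather than replacing it.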
