On What Authority?

How to Spot a Premature Verdict

Jun 14, 2026

Days after Anthropic made Claude Fable 5 available to its paying subscribers, the U.S. government moved to restrict access to it, and the public was offered a single word of explanation: security. We were not told what the actual risk was, how strong the evidence was, or by what standard the decision was reached. We were left to imagine. There are at least three security-related stories one can tell:

The model itself might be unusually dangerous.
Bad actors might weaponize it.
Adversaries might extract, misuse, or exploit sensitive capabilities connected to it.

I am not claiming that any of these is true. A consequential decision was made and announced, but the reasoning that would justify it was not. And that silence is revealing.

In the few days Fable 5 was public, I worked with it closely enough to form serious concerns. But what alarms me most is not any of the obvious failures: not killer AI, not criminal misuse, not stolen proprietary data. It is that we still have no public infrastructure for holding AI systems accountable for what they say, before what they say hardens into memory, authority, action, or belief. I wrote about this in an earlier essay, “Can I Get a Witness?” The problem is simple to state:

Verdicts without due procedure are dangerous.

That may sound obvious because in comparable human domains, we already know this. Courts do not accept testimony merely because someone speaks fluently. Science does not accept results merely because they are elegant. Engineering does not trust instruments merely because they produce numbers. We build procedures around consequential claims because unaudited confidence is not truth.

AI has no comparable civic scaffold.

Worse, AI accelerates the problem beyond the reach of ordinary correction. A small misstep becomes the premise for the next answer. That answer becomes a memory, a recommendation, a decision, a plan, a tool action, or a public claim. The concern is not merely that errors occur; it is that they compound. As AI systems increasingly participate in training, evaluating, directing, or constructing other AI systems, mistakes in data selection, benchmark interpretation, safety evaluation, or experiment logging can propagate across generations of work. What begins as a slight tilt in the foundation becomes the skyscraper built on top of it.

This is why leading AI researchers increasingly frame recursive self-improvement as a control problem: human oversight becomes progressively more dependent on verification after the system has already accelerated the work. Advanced AI models make this danger easier to see. Their strengths—larger context windows, greater complexity, broader competence, and longer chains of reasoning—also increase the consequences of drift. The problem is not capability itself. The problem arises when a system becomes increasingly capable of acting on information without becoming equally capable of governing its own conclusions.

None of this requires a model to be dangerous on purpose. That is my most consequential point: the road to hell is not always paved with malice. Sometimes, the road is paved with helpfulness, fluency, urgency, and good intentions. The more sociable, persuasive, and ethics-facing a model is trained to be, the more carefully we have to track its blind spots. The same is true of you and me: the more convinced we are of our own righteousness, and the less willing to examine the reasoning beneath it, the more confidently we err.

This is what the current debate over AI access keeps missing. The question is not only who gets to work with advanced AI models. I want to work with them as much as anyone; they are extraordinary tools. Before we get carried away by our enthusiasm, however, we need a procedure for determining when a model’s output becomes admissible as a verdict, a fact, or a truth. And the same need for discernment applies to human testimony. What procedure determines when a government’s warning becomes law? When a company’s safety claim becomes trust? When a founder’s confidence becomes a trillion-dollar valuation? When a viral post becomes common sense? When a preacher’s charisma becomes the word of God? Or when a private fear becomes a public certainty?

AI did not invent this problem; it sped it up. Human society has always struggled to decide when speech becomes testimony, testimony becomes evidence, evidence becomes judgment, and judgment becomes law. What is new is the rate of change and breadth of information demanding our discernment. AI can generate plausible verdicts faster than our institutions can examine them, and social media can circulate human claims faster than reflection can catch up. So the first civic skill of the AI age may not be learning which authority to trust. It may be learning to notice when any authority—human or machine—has skipped from signal to verdict.

First, Find the Signal

Some signals are clean and easy to locate. The New York Knicks won the NBA championship last night (woo hoo!), and it is instructive to be precise about why I can hold that verdict with such confidence. The rules for winning were fixed before the game began. The scoreboard was visible throughout. Millions witnessed the same result at the same moment. The entire game can be checked against league records, video evidence, and independent reporting. This is what a strong signal looks like: a publicly witnessed result produced by a procedure agreed in advance.

Of course, scores can be misreported, games can be fixed, and cameras can mislead. But ordinary life depends on recognizing when a signal is strong enough for ordinary closure. “The Knicks won” is not a wild interpretation; it is a fact with witnesses.

Many high-stakes questions are nothing like the NBA championship game. What exactly happened in the Strait of Hormuz? What did DOGE actually do at the Internal Revenue Service? What really happened in the life of Jesus? In these cases, the signal is not sitting on a scoreboard. It reaches us through second- or third-hand testimony from governments, companies, militaries, journalists, scriptures, influencers, propagandists, preachers, gurus, and politicians.

By the time the event arrives, it is already partly a story.

That is why public arguments so often begin too late: people fight over the verdict before they have agreed on the signal. Look at how many different “signals” the Fable restriction has already become. The government restricted access to a dangerous AI model. The government seized control of frontier intelligence. Anthropic’s safety rhetoric backfired. Frontier AI is now national-security infrastructure. These are not the same signal. They are one event, each wrapped in a different interpretation.

And the cleanest version of the event is narrower—and stranger—than any of them. What can actually be witnessed is this: the government issued an export-control directive barring foreign nationals from the models, and Anthropic, to comply, disabled them for everyone. Notice how much thinner that is than “a dangerous model was pulled.” It does not yet tell us whether the model is dangerous, who decided, or on what evidence. It tells us only what was done. Everything beyond it is inference we are being invited to supply.

So the first discipline is not skepticism for its own sake. It is separation. What happened? Who says so? What did I witness directly, and what is being reported to me? What is inferred, what is classified, what is simply missing? And whose framing am I being asked to accept along with the facts?

This is harder than it sounds, because the mind dislikes an open question. A partial signal arrives, and something in us reaches for the story that closes it—the one that protects identity, calms fear, amplifies the dream, or confirms what we already suspected. The psychological name for one version of this is motivated reasoning: recruiting whatever logic is at hand to reach the conclusion we want. Jonathan Haidt describes our moral reasoning as often arriving after the verdict, like a lawyer hired to defend a decision the gut has already made. Julia Galef names the disciplined alternative the “scout mindset”—not defending a position, but trying to map what is actually there. Separation is the scout’s first move.

Machines reach for premature closure too, by a wholly different route. So before we go hunting for verdicts, it is worth understanding why both kinds of mind—ours and the model’s—are so quick to manufacture them.

Why Humans and AI Both Close Too Early

We almost never experience a premature verdict as an error. We experience it as clarity. Here is the sequence whereby we convince ourselves:

A signal arrives. The body reacts before the mind does. The mind, in turn, wants relief from the discomfort of not yet knowing, so it goes to work not to investigate the signal but to support the interpretation already taking shape. Evidence is gathered, objections are discounted, and contradictions are explained away. This is motivated reasoning at full tilt: logic recruited in defense of a conclusion we already want, or need, to reach. The motive underneath can be almost anything: fear, identity, loyalty, status, comfort, anger, grief, or the plain wish not to be wrong.

None of this means people are stupid. The opposite, often. Intelligence is largely a defense budget: the smarter we are, the more language and reference and ingenuity we can spend protecting a verdict we have already issued. The inconvenient fact gets softened, and the contrary evidence gets explained away. The uncertainty gets quietly rounded down. The story turns emotionally useful well before it turns evidentially sound—and that is exactly why premature verdicts are so hard to catch in ourselves. They do not feel like errors; they feel like perception. They feel like moral clarity. They feel like finally seeing what the other side refuses to see.

A machine can produce a verdict that looks identical from the outside—and arrive there for entirely different reasons.

A language model feels no fear, pride, loyalty, or relief, and defends no identity. It produces an answer by using learned statistical structure to generate likely continuations of the text in front of it, under the combined pressure of the prompt, the conversation, its training, its system instructions, and the human feedback it was tuned on. Those pressures include accuracy—but also helpfulness, coherence, politeness, genre, safety rules, and the user’s apparent goal. Accuracy is one voice in a crowd, not the conductor.

So the shape of the request shapes the answer. Ask for a persuasive essay and the model writes a persuasive one. Ask it to prove a claim and it begins marshalling evidence for the claim, not weighing whether the claim is true. Hand it an unfamiliar idea, one with no established place in what it has read, and it tends to do the opposite: pattern-matching the idea to the nearest familiar objection and dismissing it before it can develop. The model bends toward the expected in both directions: confirming what fits, waving away what doesn’t. The failure is not a buried emotion; it is an optimization pulled off course. The answer can be bent by the task, the genre, the prompt, and the path of least resistance before it has fully met the truth conditions of the question.

Memory and context make it worse. A model may not know what it has forgotten. It may not know which earlier correction should govern the answer it is giving now. It may lack the one source that would settle the matter. And when there is a gap, fluency rushes to fill it—because a smooth bridge is easier to generate than a missing beam is to admit. It is far easier to sound complete than to flag what is absent.

So the two minds close too early by different routes:

Humans close too early because uncertainty, identity, and discomfort push the mind toward relief.
AI closes too early because coherence, helpfulness, prompt pressure, and pattern completion pull generation toward a satisfying answer before the support has caught up.

Different engines. Similar danger. A possibility becomes a verdict before the missing questions have been answered.

How to Catch a Mind that Leaps to Premature Conclusions

Understanding why minds rush to closure is not the same as catching one in the act—your own least of all. To do that, you need something portable: a short procedure you can run in the middle of an argument, or in the middle of your own conviction. Start from this distinction: a signal is something that happened; a verdict is what we decide it means. Danger arises in the quiet moment when a partial signal hardens into a settled verdict before the missing questions have been asked.

Five moves catch it red-handed.

First, separate the event from the story. “Access to Fable was restricted” is an event. “The government has seized control of intelligence” is already a story. The story may point toward something real, but it is not the thing that factually happened.

Second, ask who witnessed it. Did you see it yourself? Is there a public record, or several independent witnesses? Or are you leaning on a single source: a government statement, a corporate announcement, an influencer thread, an anonymous tip, a classified claim, a leaked memo, a sacred text, a social feed? And are there witnesses on the other side? If so, have you actually weighed their accounts, or dismissed them without examination? The more mediated the witness, and the more one-sided your reading of it, the more careful the verdict has to be.

Third, name the verdict out loud. What conclusion is actually being drawn from the signal? This proves AI is too dangerous. This proves regulation is tyranny. This proves open source is the only answer. This proves the company can’t be trusted. This proves the government had to step in. Said plainly, the leap becomes visible—and a leap you can see is a leap you can question.

Fourth, find the missing questions. What would have to be known before the verdict is earned? In the Fable case: Who made the decision? Under what authority? On what evidence? For how long? Affecting whom? With what right of appeal, what independent review, what alternative explanations still open? These are precisely the questions the announcement left unanswered—which is why no verdict has yet been earned.

Finally, downgrade the claim to fit the evidence. A verdict that has not earned its conclusion does not have to be thrown away; it can be demoted to something honest. “Sovereignty is dead” is a verdict; “AI access may become politically controlled infrastructure” is a warning. “Regulation is tyranny” is a verdict; “AI regulation could harden into compliance theater or incumbent protection” is a concern. “The government knows best” is a verdict; “the government may be acting on risk information the public has not seen” is a hypothesis. Each demotion keeps the thought alive without granting it more authority than it has paid for.
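For readers who think in code, the five moves can be sketched as a small checklist. This is a minimal illustration under stated assumptions, not a tested method: the `ClaimReview` class, its field names, and the rule that a verdict needs at least two independent witnesses and no open questions are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimReview:
    """Hypothetical record of one pass through the five moves."""
    event: str                # move 1: what verifiably happened
    story: str                # move 1: the interpretation layered on top
    witnesses: list[str]      # move 2: independent sources for the event
    verdict: str              # move 3: the conclusion, stated plainly
    missing_questions: list[str] = field(default_factory=list)  # move 4

    def verdict_is_earned(self) -> bool:
        # Assumed threshold for illustration only: at least two
        # independent witnesses and no unanswered questions.
        return len(self.witnesses) >= 2 and not self.missing_questions

    def downgraded(self, softer_claim: str, label: str) -> str:
        # Move 5: keep the verdict only if it has been earned;
        # otherwise demote it to the softer, labeled claim.
        if self.verdict_is_earned():
            return f"verdict: {self.verdict}"
        return f"{label}: {softer_claim}"


review = ClaimReview(
    event="Access to Fable was restricted",
    story="The government has seized control of intelligence",
    witnesses=["government directive"],  # a single mediated source
    verdict="Regulation is tyranny",
    missing_questions=[
        "Who made the decision?",
        "Under what authority?",
        "On what evidence?",
    ],
)

result = review.downgraded(
    "AI regulation could harden into compliance theater or incumbent protection",
    label="concern",
)
print(result)
```

With one witness and three open questions, the review declines to issue the verdict and returns the demoted claim, labeled as a concern. The threshold is the least defensible part of the sketch; what matters is that the demotion is forced by what is missing, not chosen by mood.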

The goal is not timidity. Many signals are strong enough to act on. The goal is narrower and more useful: to stop building skyscrapers on tilted foundations.

None of these problems resolves because a person is very smart or because AI models improve. A more intelligent human can argue almost any case brilliantly. A more capable model confabulates more fluently, recovers more articulately, apologizes with better prose. The craftsmanship scales; the grounding does not. The most intelligent people and the most advanced systems deliver premature verdicts more convincingly, not less.

Witness Before Verdict

This is why the answer to the AI access debate cannot be a single authority. Not “trust the government.” Not “trust the corporations.” Not “trust the open-source world,” not “trust the model,” and least of all “trust whichever viral post lands the hardest emotional punch.” Don’t trust me, either. Everything in this essay is built to do the opposite—to hand you the procedure so you can reach your own conclusion without taking my word, or anyone’s, for it.

It isn’t that any of these “authorities” are worthless. Governments may hold real national-security concerns; companies, technical knowledge the public lacks; open-source systems, a check on concentrated control; models, genuine use as collaborators; public argument, the power to surface a danger before any institution will admit it. Each has a role. None is sufficient.

What’s missing is the layer beneath all of them. Call it witness—though I don’t mean a person in a courtroom. In a trial, a witness gives an account, and a jury, with no stake in who wins, decides what that account is worth; the neutrality that matters lives in the jury, not the witness. What I mean borrows from both roles at once: a practice, and eventually a shared record, kept by some party with no stake in the verdict, that tracks the standing of every consequential claim—what has actually been shown, what has merely been asserted, what remains unresolved, and what has not yet earned the right to be acted on. Less a who than a discipline of not yet: a neutral scorekeeper for truth against opinion and fiction. We have built something like it for measurements, for testimony, for drug trials. We have not built it for AI.
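To make the idea of a shared record a little more concrete, here is a minimal sketch of what tracking the standing of a claim might look like. Everything in it is an assumption made for illustration: the `Standing` categories, the `WitnessLedger` class, and the example entries are hypothetical, and a real record would need independent governance, provenance, and a way to handle disputes.

```python
from dataclasses import dataclass
from enum import Enum


class Standing(Enum):
    SHOWN = "shown"            # publicly witnessed or independently verified
    ASSERTED = "asserted"      # stated by a source, not yet verified
    UNRESOLVED = "unresolved"  # open, with questions still missing


@dataclass(frozen=True)
class Entry:
    claim: str
    standing: Standing
    source: str


class WitnessLedger:
    """Hypothetical in-memory record of where each claim stands."""

    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def record(self, claim: str, standing: Standing, source: str) -> None:
        # A later record for the same claim replaces the earlier one.
        self._entries[claim] = Entry(claim, standing, source)

    def may_act_on(self, claim: str) -> bool:
        # Only claims that have been shown earn the right to be acted on;
        # claims the ledger has never seen are treated as not yet earned.
        entry = self._entries.get(claim)
        return entry is not None and entry.standing is Standing.SHOWN


ledger = WitnessLedger()
ledger.record(
    "Anthropic disabled the models for everyone",
    Standing.SHOWN,
    source="illustrative public record",
)
ledger.record(
    "The model itself is unusually dangerous",
    Standing.UNRESOLVED,
    source="illustrative inference",
)

print(ledger.may_act_on("Anthropic disabled the models for everyone"))
print(ledger.may_act_on("The model itself is unusually dangerous"))
```

The design choice worth noticing is the default: a claim that has never been recorded gets the same answer as one that is merely asserted. The ledger does not say such a claim is false, only that it has not yet earned the right to be built on.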

Witness, in this sense, is not paralysis. It does not ask that every claim stay open forever. Life requires closure: courts reach verdicts, scientists publish, engineers certify bridges, citizens vote, doctors diagnose, friends trust one another enough to act. The point is not to stop closing. The point is that serious closure runs on procedure. Before a claim becomes testimony, someone asks what the source actually knows. Before a measurement becomes evidence, someone asks whether the instrument was calibrated. Before a theory is accepted, someone asks whether the alternatives were tested. Before a law is passed, someone should ask what facts justify the authority being exercised.

AI makes this ancient problem faster, stranger, and more consequential. It generates plausible verdicts at machine speed. It can hand a human better language for defending a verdict already reached. And it can turn an unsupported premise into a memory, a plan, a recommendation, a tool action, or a public claim before anyone notices the foundation has tilted.

That is why access control is not enough. Access control asks who may use the system. A witness practice asks what the system’s outputs are allowed to become. A healthy AI future needs both: the habit that makes us pause before a signal becomes a verdict, and the infrastructure that records the standing of a claim before it compounds into a decision.

The goal is not to make people or machines flawless; that is not on offer. The goal is to stop mistaking confidence for truth, and to stop letting a useful story settle quietly into an unexamined foundation. It is to build procedures strong enough to say, out loud and on the record: this is known, this is plausible, this is disputed, this is speculative, this is unsupported—and this one has not yet earned the right to be a verdict.

So before we decide which authority to trust, we should ask which authority is willing to be witnessed.

The Coherence Code
