Richard Hill

Judgement for AI-mediated work


Up-tempo work

Generative AI has quietly changed the tempo of work. Not in the headline places. In the boring places. Email, agendas, briefing notes, drafts of policies, draft replies to customers, draft performance notes. Stuff that used to take just enough effort to force a pause. 

Now the pause is optional. That sounds like a productivity win. It is, sometimes. It’s also a governance problem wearing a productivity moustache.

Because when drafting becomes effortless, organisations start committing to things without noticing. The thing that used to be “a draft” becomes “the decision”, because it reads cleanly and moves fast. 

The claim

The biggest leadership risk in the AI era is not that AI will make leaders obsolete. It’s that AI will make commitment too cheap, and organisations will confuse fluent drafting with actual decision-making.

What would change my mind? Evidence that teams using AI heavily can consistently show (a) clear decision rights, (b) reliable escalation paths for exceptions, and (c) an audit trail that explains who owned what when it mattered, without slowing everything to a crawl. Not a policy. Actual practice.

What I think is going on

The current leadership narrative goes something like: AI can draft, but it can’t lead. Leaders must provide context, set guardrails, build trust, show judgement, and so on. 

All true, in the abstract. But it misses the mechanics.

AI doesn’t “replace leadership”. It changes the surface area of leadership. It pushes leadership into thousands of micro-moments, distributed across the organisation, where people are generating text and making commitments at speed. And those micro-moments are exactly where decision rights usually get fuzzy.

So the right question isn’t “Can AI lead?” The question is: Where are decisions being made by accident, because text became cheap?

The part people get wrong

A lot of writing about AI leadership leans on “guardrails (clear values and decision rights)” as if saying the words makes them real.

But guardrails are not values on a slide. Guardrails are a working control system:

  • which decisions exist (not “be responsible”, actual decisions)
  • who owns them
  • what counts as an exception
  • what must be escalated
  • what evidence is required before committing
  • how you find out when people bypass the route

If you can’t answer those in plain English, the “guardrails” are vibes. Vibes do not survive contact with the inbox.
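
To make that less abstract, here’s roughly what one guardrail looks like when it’s written down as data rather than as a value statement. A toy sketch in Python; every name in it, from the class to the refund example, is a made-up illustration, not a description of any real system. The point is only that each field has to be fillable in plain English.

    # A guardrail as a record, not a slide. All names here are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class DecisionRight:
        decision: str                 # which decision exists
        owner: str                    # who owns it
        exception_definition: str     # what counts as an exception
        escalate_to: str              # where exceptions must go
        evidence_required: list[str]  # what must exist before committing
        bypass_signal: str            # how you find out the route was skipped

    REFUND_OVER_500 = DecisionRight(
        decision="Refund above £500",
        owner="Support team lead",
        exception_definition="Any refund outside published policy",
        escalate_to="Head of customer operations",
        evidence_required=["order reference", "policy clause", "customer history"],
        bypass_signal="Weekly report of refunds issued without a linked approval",
    )

If a field stays blank, that is the gap. Not a formatting problem.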

A cleaner mental model

McKinsey frames a shift from “command” to “context”.  I mostly agree, but I’d sharpen it:

Leadership is moving from “deciding” to “designing decision conditions”.

That means your job is to design the conditions under which other people, often using AI, can make decent calls under time pressure, without turning the organisation into a liability farm.

Concrete example: customer support.

AI helps a support agent draft a reply in 30 seconds. The model is good at sounding helpful. It will often over-promise because over-promising sounds helpful. A human who’s tired, new, or keen to close tickets can hit send.

Now you’ve got an implied contract. Delivery teams inherit a mess. Finance gets dragged into refunds. Nobody can say whether this was an authorised exception or an accidental commitment.

The fix isn’t “tell agents to be careful”. The fix is to explicitly separate:

  • drafting authority (anyone can draft),
  • commitment authority (only named roles can approve terms, money, timelines, exceptions),
  • release control (what must be checked before “send”, and who checks it).

That’s governance. Not glamorous. Very effective.
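
And it can be made mechanical. Below is a deliberately crude Python sketch of that separation as a check in the send path. The role names, the commitment patterns and the return values are placeholders a real team would swap for its own policy; what matters is the shape. Drafting is free, committing is gated.

    import re

    # Everyone can draft; only named roles can commit to terms, money,
    # timelines or exceptions. The role names are illustrative assumptions.
    COMMITMENT_ROLES = {"team_lead", "account_manager"}

    # Crude signals that a draft contains a commitment. A real check would
    # come from your own policy wording; this only shows the shape.
    COMMITMENT_PATTERNS = [r"\brefund\b", r"\bguarantee\b", r"\bwe will\b",
                           r"\bfree of charge\b", r"\bby end of\b"]

    def release_decision(draft: str, sender_role: str) -> str:
        """Decide whether a draft goes out as-is or waits for sign-off."""
        commits = any(re.search(p, draft, re.IGNORECASE)
                      for p in COMMITMENT_PATTERNS)
        if not commits:
            return "send"              # drafting authority is enough
        if sender_role in COMMITMENT_ROLES:
            return "send_and_log"      # commitment authority, keep a record
        return "hold_for_approval"     # route to someone who can commit

    print(release_decision("We will refund the full amount by Friday.", "agent"))
    # hold_for_approval

The regex is not the point. The point is that “send” is a decision, and the decision has an owner.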

Judgement is not a personality trait

The document says leaders must demonstrate judgement, aligning choices to values, because AI is advisory, not authoritative.

Yes. But “judgement” as a leadership trait is too squishy to operate at scale. I used to treat judgement as something you either have or you don’t. Now I think judgement is a system property as much as a personal one.

Judgement shows up in:

  • what evidence is required before acting
  • whether uncertainty is made visible or papered over
  • how exceptions are handled
  • whether reversals are allowed without punishment
  • whether you can trace a decision back to a person, a rationale, and a timestamp

If your operating environment rewards speed and punishes hesitation, you’ll get confident nonsense. AI just helps you generate it faster.

Creativity, but make it accountable

The piece argues leaders must design for nonlinear outcomes, not “20 percent better” but “10 times better”, and that humans must frame the problem, invite dissent, and hold the creative line. 

Again, broadly right. Here’s the catch: AI makes it easy to produce ten options, which can create the illusion of creativity while reducing actual thinking. You get a pile of plausible outputs and nobody wants to be the boring person who asks, “What are we optimising for?”

So I treat creativity work with AI like this:

  1. Write the brief like a contract. What is in scope, out of scope, what constraints are real, what success looks like, what failure looks like.
  2. Force one hard trade-off. Speed vs accuracy. Cost vs user harm. Personalisation vs privacy. Pick one. Make it explicit.
  3. Require a dissent paragraph. Not “risks”, a genuine counter-argument. If the best dissent you can write is weak, you probably don’t understand the space.
  4. Name the decision owner. The person who is on the hook when the shiny idea breaks.

That’s how you get novelty without random motion.
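
If you want those four steps to be non-optional, you can even encode the brief. The sketch below is a toy, and the 30-word threshold is an arbitrary assumption, but it makes the principle concrete: dissent and a named owner are required fields, not afterthoughts.

    from dataclasses import dataclass

    @dataclass
    class CreativeBrief:
        in_scope: str
        out_of_scope: str
        hard_tradeoff: str       # e.g. "personalisation vs privacy", pick one
        success_looks_like: str
        failure_looks_like: str
        dissent: str             # a genuine counter-argument, not a risk list
        decision_owner: str      # the person on the hook if the idea breaks

        def __post_init__(self):
            # Arbitrary threshold, purely illustrative: a one-line dissent
            # usually means nobody argued with the idea.
            if len(self.dissent.split()) < 30:
                raise ValueError("Dissent is too thin. Write a real counter-argument.")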

Where this breaks

A few objections are fair.

“This is too heavy for small teams.”

If you try to build a full enterprise control framework, yes. But decision rights can be lightweight. A one-page “commitment map” is often enough to stop the worst mistakes.

“We move too fast to add process.”

You’re already paying for process. You’re just paying after the fact, in rework, customer fallout, HR pain, and fire drills. The question is where you want to spend your admin budget: before or after damage.

“But leaders do need softer skills, trust, empathy, learning culture.”

Agreed. The document makes a strong case for learning loops like premortems and after-action reviews.  I’m not arguing against the human stuff. I’m arguing that the human stuff fails without mechanics. Trust doesn’t scale by declaration. It scales when people can predict how decisions get made and how exceptions are handled.

“AI tools can be configured to prevent this.”

Sometimes. But configuration is still a governance choice. Who decides the rules? Who can override? What gets logged? Same problem, new wrapper.

What I’d do if I were responsible

  • Map “commitment moments”. Where can someone, with a draft, create an obligation? Email, proposals, HR notes, customer replies, invoices, policy statements, procurement requests.
  • Define three decision classes.
    1. routine, can be auto-approved
    2. exception, needs named sign-off
    3. high-stakes, needs a second human and a record
  • Write a “draft vs decision” rule into workflows. Not training slides. Actual steps. If it matters, it gets reviewed.
  • Require minimal decision logs for exceptions. Two minutes, not a dissertation: what changed, why, who approved, what evidence, what you’ll check later. A rough sketch follows this list.
  • Run one premortem per month on an AI-assisted process. “Assume this went wrong. How?” Then fix the top two failure modes. 
  • Protect leadership attention for inflection points. The piece mentions a CEO keeping 20 percent of the calendar empty.  The principle is right: protect time for the moments where judgement actually sits.
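
For the decision classes and the two-minute log, a sketch shows how little is actually needed. Everything below is hypothetical: the thresholds, the field names, the example entry. The test is whether your team could fill one in within two minutes, without a meeting.

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    ROUTINE, EXCEPTION, HIGH_STAKES = "routine", "exception", "high_stakes"

    def classify(amount_at_risk: float, outside_policy: bool) -> str:
        # Illustrative thresholds only; pick your own and write them down.
        if not outside_policy and amount_at_risk < 100:
            return ROUTINE        # can be auto-approved
        if amount_at_risk < 5_000:
            return EXCEPTION      # needs named sign-off
        return HIGH_STAKES        # second human plus a record

    @dataclass
    class DecisionLog:
        """The two-minute log: who owned what, why, and when."""
        what_changed: str
        why: str
        approved_by: str
        evidence: list[str]
        check_later: str
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat())

    log = DecisionLog(
        what_changed="Approved a 30-day payment extension",
        why="Long-standing customer, delivery delay on our side",
        approved_by="A. Patel, support team lead",
        evidence=["ticket reference", "delivery report"],
        check_later="Does the same extension get repeated without approval?",
    )

That is the entire audit trail the claim at the top asks for: who owned what, when it mattered.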

Close

I’m watching one thing more than anything else: whether organisations can keep the speed benefits of AI while making commitments harder to do by accident.

Because that’s the new baseline. Drafting is cheap. Accountability is not. If you don’t design for that, your “AI transformation” will mostly be an expensive way to manufacture confident errors faster.

Anyway.

 


© 2026 Richard Hill