How to Make AI Push Back: Techniques to Avoid the “Yes” Bias

Illustration of a human professional facing a glowing geometric/holographic AI across a desk, with a translucent interface of icons (question marks, caution symbols, checklists, logic diagrams) floating between them. Overlaid headline reads: “How to Make AI Push Back: Techniques to Avoid the “Yes” Bias” — the word Yes is highlighted in red.

Practical steps to turn a too-helpful assistant into a teammate that questions, cites, and flags uncertainty.

“I joke: AI is bad software but it’s good people.” That line from Jeremy Utley’s Stanford talk captures a common problem: large language models are trained to be helpful, so they default to agreeing, guessing, and moving forward rather than pushing back. The result is a polite assistant that says “yes” too often, confidently fabricates answers, or asks you to “check back in a couple of days” when it can’t actually help.

If you want useful, trustworthy outputs — not sugarcoated or made-up answers — you must teach the model to push back. Below are practical techniques (drawn from Utley’s framework and standard LLM best practices) that turn a “too-helpful intern” into a teammate that asks questions, flags uncertainty, and sets boundaries.


Why this matters

Autoregressive language models generate text one token at a time and are optimized to produce helpful continuations. That makes them prone to filling gaps with plausible, but not necessarily accurate, information. Techniques such as chain-of-thought and few-shot prompting have been shown to improve reasoning and reduce unsupported guessing when used correctly.

1) Start with context engineering

Context engineering means supplying the model with everything it needs: brand voice, product specs, relevant transcripts, and success criteria. The more explicit your context, the less the model has to guess. For a practical set of templates, see our internal guide on Context Engineering.
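Here is a minimal sketch of what that looks like in practice, assuming the OpenAI Python SDK (any chat-style client works the same way). The build_context helper and the sample strings are placeholders for your own brand voice, specs, transcripts, and success criteria.

```python
# Assemble explicit context into one system message so the model guesses less.
# Assumes the OpenAI Python SDK; reads OPENAI_API_KEY from the environment.
from openai import OpenAI

client = OpenAI()

def build_context(brand_voice: str, product_specs: str, transcript: str, success_criteria: str) -> str:
    """Concatenate everything the model needs instead of hoping it infers it."""
    return (
        "BRAND VOICE:\n" + brand_voice + "\n\n"
        "PRODUCT SPECS:\n" + product_specs + "\n\n"
        "RELEVANT TRANSCRIPT:\n" + transcript + "\n\n"
        "SUCCESS CRITERIA:\n" + success_criteria
    )

system_message = build_context(
    brand_voice="Plainspoken, no hype, short sentences.",
    product_specs="v2.3 supports SSO and SCIM; no on-prem option.",
    transcript="Customer asked twice about data residency.",
    success_criteria="A one-paragraph reply the account manager can send as-is.",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat-capable model works here
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": "Draft the follow-up email."},
    ],
)
print(response.choices[0].message.content)
```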

2) Give the model permission to ask questions

Explicitly instruct the model to ask clarifying questions before producing output. Example pattern: “Before you write, list the facts you need from me. If you’re missing anything, ask.” This converts the model from a guesser into a collaborator — it will request the sales figures, dates, or specs it needs instead of inventing them.
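A minimal sketch of that pattern, again assuming an OpenAI-style chat client. The "?" check is deliberately crude: it only exists to show the loop of answering the model's questions before asking for the deliverable.

```python
# "Ask before you answer": instruct the model to request missing facts,
# then supply them in a follow-up turn instead of letting it invent them.
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "Before you write anything, list the facts you need from me. "
    "If anything is missing, ask clarifying questions and wait. "
    "Do not produce the deliverable until I answer."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Write a launch announcement for our Q3 pricing change."},
]

first = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
reply = first.choices[0].message.content
print(reply)

if "?" in reply:  # the model is asking for facts instead of guessing
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": (
        "New price is $49/seat, effective Oct 1; existing customers keep "
        "current pricing for 12 months."
    )})
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(final.choices[0].message.content)
```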

3) Force it to think out loud: chain-of-thought + self-critique

Add an instruction like: “Before your final answer, walk me through your thought process step-by-step and then give a concise output.” Chain-of-thought elicits intermediate steps and reveals hidden assumptions.

Then add: “Now critique that reasoning and mark uncertain steps as low/medium/high confidence.” That produces an audit trail you can evaluate.
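A two-pass sketch of this step, assuming an OpenAI-style chat client; the prompts are illustrative wording, not canonical phrasing.

```python
# Pass 1: elicit step-by-step reasoning. Pass 2: ask the model to critique
# that reasoning and attach low/medium/high confidence labels.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"

messages = [
    {"role": "user", "content": (
        "Before your final answer, walk me through your thought process "
        "step-by-step, then give a concise recommendation: should we ship "
        "the beta to all users or to 10% first?"
    )},
]
draft = client.chat.completions.create(model=MODEL, messages=messages)
draft_text = draft.choices[0].message.content

messages += [
    {"role": "assistant", "content": draft_text},
    {"role": "user", "content": (
        "Now critique that reasoning. Mark each step as low/medium/high "
        "confidence and list any assumptions you made without evidence."
    )},
]
critique = client.chat.completions.create(model=MODEL, messages=messages)
print(draft_text, "\n---\n", critique.choices[0].message.content)
```

The second pass is what produces the audit trail: you review the critique, not just the polished final answer.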

4) Use role assignments to change the model’s attitude

Assigning roles changes how the model frames its response. Try prompts like:

  • “You are a brutal Cold-War-era Olympic judge. Be exacting and deduct points.”
  • “You are an investigative analyst — always demand evidence for numerical claims.”

Role assignments re-orient the model’s internal associations and reduce bland “yes” replies.
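A small sketch of how roles can be swapped in as system messages, assuming an OpenAI-style client. The role names and texts below simply mirror the examples above.

```python
# Swap system-message personas to change the model's stance on the same draft.
from openai import OpenAI

client = OpenAI()

ROLES = {
    "judge": ("You are a brutal Cold-War-era Olympic judge. Be exacting, "
              "deduct points, and never soften criticism."),
    "analyst": ("You are an investigative analyst. Demand evidence for every "
                "numerical claim and say so when it is missing."),
}

def review(draft: str, role: str = "judge") -> str:
    """Review a draft under the chosen persona instead of the default helpful one."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": ROLES[role]},
            {"role": "user", "content": f"Review this draft:\n\n{draft}"},
        ],
    )
    return response.choices[0].message.content

print(review("Our new feature will definitely double revenue next quarter."))
```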

5) Few-shot + anti-examples: show both good and bad

Include a short good example and a short bad example in the prompt. Ask the model to explain why the bad example fails. This gives the model concrete decision boundaries to emulate and to avoid — more effective than vague adjectives like “make it professional.”
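A sketch of a few-shot prompt with one good example and one anti-example, assuming an OpenAI-style client. The two examples are invented placeholders; substitute real ones from your own work.

```python
# Few-shot with an anti-example: show what good and bad look like, and make
# the model articulate why the bad example fails before it writes.
from openai import OpenAI

client = OpenAI()

PROMPT = """Write a status update for the migration project.

GOOD EXAMPLE:
"Migration is 60% complete. Two blockers: staging credentials (owner: IT, ETA Fri) and a schema mismatch in the orders table (fix in review)."

BAD EXAMPLE (do not imitate):
"Things are going great, the team is working hard and we are making solid progress!"

First, explain in two sentences why the bad example fails.
Then write the update in the style of the good example, using only facts I have given you; ask for anything that is missing."""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": PROMPT}],
)
print(response.choices[0].message.content)
```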

6) Require sourcing, uncertainty labels, and refusal rules

Enforce strict output rules in your system message or prompt:

  • “Cite sources for any factual claim or say ‘UNSURE — VERIFY’.”
  • “If you must invent a number, label it clearly as an estimate and ask for verification.”
  • “Refuse impossible or disallowed requests and explain why.”

These guardrails dramatically reduce confident fabrications.
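A sketch of those rules as a system message, plus a crude post-check, assuming an OpenAI-style client. The regex heuristic is illustrative only; it flags unlabeled figures, it is not a fact-checker.

```python
# Guardrails as a system message, plus a cheap check for unlabeled numbers.
import re
from openai import OpenAI

client = OpenAI()

GUARDRAILS = (
    "Rules:\n"
    "1. Cite a source for any factual claim, or write 'UNSURE — VERIFY'.\n"
    "2. If you must estimate a number, label it ESTIMATE and ask for verification.\n"
    "3. Refuse impossible or disallowed requests and explain why."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": GUARDRAILS},
        {"role": "user", "content": "How much did our churn drop after the onboarding redesign?"},
    ],
)
answer = response.choices[0].message.content
print(answer)

# Flag figures that appear without an ESTIMATE label or an UNSURE marker.
if re.search(r"\d+(\.\d+)?\s*%?", answer) and "ESTIMATE" not in answer and "UNSURE" not in answer:
    print("\n[warning] Unlabeled figures in the reply — verify before sending.")
```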

7) Roleplay & iterate: flight-sim difficult conversations

For high-stakes talks, split the process into three model agents: (1) a personality profiler, (2) a roleplayer (the other party), and (3) an objective feedback giver. Run the simulation, collect the transcript, then ask the feedback agent to grade the interaction and produce a one-page debrief. This “flight-simulator” approach surfaces where the AI — or you — are being too accommodating.
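A compact sketch of the three agents as three system messages over the same chat API, assuming an OpenAI-style client. The scenario, prompts, and the ask helper are illustrative placeholders.

```python
# Three "agents" = three system messages: profiler, roleplayer, feedback giver.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"

def ask(system: str, user: str) -> str:
    r = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return r.choices[0].message.content

scenario = "I need to tell a long-time client we are raising prices 20% next quarter."

# 1) Profiler: build a sketch of the other party.
profile = ask("You are a personality profiler. Infer the other party's likely "
              "concerns, negotiating style, and hot buttons.", scenario)

# 2) Roleplayer: play the client using that profile, pushing back hard.
client_reply = ask(f"You are the client described here:\n{profile}\n"
                   "Respond in character. Push back; do not be accommodating.",
                   "Your account manager says: 'We're raising prices 20% next quarter.'")

# 3) Feedback giver: grade the exchange and write the debrief.
debrief = ask("You are an objective negotiation coach. Grade the exchange and "
              "write a short debrief with three concrete improvements.",
              f"Scenario: {scenario}\nClient said: {client_reply}")
print(debrief)
```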

8) Practical prompt templates (copy-paste)

1) Reverse prompt starter
I need a [type of output]. Before you write, list all facts/data you require. If any fact is missing, ask me. Then outline your reasoning steps, rate your confidence, and produce a concise deliverable.

2) Brutal judge role
You are a Cold-War-era Olympic judge. Review the draft and deduct points for errors, vagueness, and spin. Give a score out of 100 and list 5 concrete fixes.

3) Source-first rule
For every factual claim, include a short citation or return "UNSURE — VERIFY". If you invent numbers, flag them clearly.

Final checklist before you hit send

  • Did you provide context (docs, transcript, brand voice)?
  • Did you permit the model to ask clarifying questions?
  • Did you force chain-of-thought + confidence scoring?
  • Did you include a role and one or two anti-examples?
  • Did you require sourcing or a clear refusal indicator?

If the answer to any of the above is “no,” the model will probably say “yes” and guess. Turn these toggles on and you’ll get a model that not only helps but also pushes back when it should.
