The Problem With "Show Your Work" When AI Can Show It For You
The Viral AI Infographic Gets Process Grading Wrong — Here's What Actually Works
This week, Khanmigo was declared dead. And yet, infographics like this are still going viral, telling students to use AI in the same way that failed for Khanmigo.
I'm leaving out the name of the person who prompted it because this isn't about calling people out, but I am surprised in 2026 almost nobody is questioning this advice:
OK, not all of it is bad. Here’s the only part I agree with:
Don’t ban AI, teach thinking.
But the rest deserves scrutiny, including, I'll admit upfront, some previously proposed solutions.
The Socratic prompt problem
I wrote about the issues with the current Socratic AI method last year, the day study mode was released. And there are plenty of students running into the same issue.
Many students I have spoken to say AI Socratic methods feel frustrating because they default to an "explanation-feedback" loop rather than genuine, tailored questioning. They can feel slow, sometimes condescending, and evasive, turning a quick question into a tedious exercise. When students just need a direct answer, being forced into a back-and-forth conversation wastes time and kills momentum.
To be fair, a meta-analysis of 51 studies (Wang & Fan, 2025) suggested ChatGPT does improve learning performance (but the gains are concentrated in surface-level outcomes). The effect on higher-order thinking is much weaker, and it depends heavily on how AI is used. AI tutors have many problems (and Dan Meyer wrote about it this week too). Khanmigo didn’t fail because the technology wasn’t good enough, but rather because students didn’t engage with it in the way the model assumed they would.
Even with massive funding, integration, and visibility, usage stayed low, interactions were shallow (“IDK IDK”), and the system had to become increasingly intrusive just to get attention.
And it’s the exact same assumption this viral infographic is making.
So why is this a questionable infographic?
1. It puts AI in charge of the reasoning sequence
"Act as my expert Algebra 1 tutor…"
The student is no longer deciding:
-Wait, how do I approach this?
-What strategy should I try first?
The AI now controls the sequence, pacing, and logic. That’s cognitive offloading at the process level (not just the answer level).
Wharton researchers Shaw & Nave (2026) gave this a name: cognitive surrender. Across nearly 10,000 reasoning trials, they attempted to show that simply having AI available caused people to stop thinking independently more than half the time — even when the AI gave wrong answers. The students didn't just use AI as a tool. They also handed it the wheel.
2. “Never give me the answer”... sure, but withholding answers isn't the same as building understanding.
This sounds pedagogically smart, but it’s actually shallow.
Students can still:
-Make random guesses
-Pattern-match or follow breadcrumbs without understanding
So it just prevents answer-giving (which is only going to frustrate the student), but doesn't require real thinking. Withholding answers is not the same as somehow enforcing cognition.
A 2024 experiment with 600 students across 10 universities (Niloy et al.) proposed that ChatGPT use caused a significant drop in originality and accuracy in creative writing (even while structure and presentation improved). The lesson: AI can make students look better on the surface while hollowing out the thinking underneath. And a 2025 RCT (Barcaui) attempted to reinforce exactly this: students who studied with ChatGPT scored 11 percentage points lower on a surprise retention test 45 days later than those who studied traditionally. Barcaui calls this "borrowed competence": students confuse the AI's fluency with their own understanding.
3. It creates dependency and trains students to wait for external prompts
“Ask me a guiding question… wait for my response”
This trains students to:
-Wait for prompts
-Rely on external nudges
Instead of:
-Generating their own next step
-Monitoring their own thinking
That’s metacognitive offloading. MIT researchers proposed the brain does less when AI does the thinking. And the effect didn’t reverse immediately when AI was taken away.
They’re not learning how to think; they’re learning how to be led through thinking.
And here’s the bigger picture that should concern all of us: when everyone uses the same AI tutor running the same Socratic script, we risk everyone converging on the same reasoning patterns. Research from USC published in Trends in Cognitive Sciences (Sourati et al., 2025/2026) found that LLMs systematically flatten cognitive diversity by mirroring dominant patterns in their training data. At scale, a prompt like this isn’t just ineffective. It’s homogenizing.
The third problematic part of the infographic:
The accountability problem:
“Require students to submit their AI chat link”
“Grade the conversation”
Many students can't easily copy and paste their chat history (and let’s be real, reviewing it is extremely tedious for instructors). This also creates the illusion that if only we can see the process, we can assess thinking!
In reality, requiring students to submit AI chat logs and grading the conversation has an obvious weakness: conversations can be retrofitted, and AI can generate the appearance of a thinking process on request. Visibility into process isn't the same as evidence of thinking. This is a real limitation, and I don't have a clean solution to it.
It actually shifts offloading one layer deeper
Instead of:
-“Give me the answer”
Students can now offload:
-“Give me the process”
That’s metacognitive offloading: they’re outsourcing both how to think and how to show thinking.
So what actually works?
In ongoing work with Jason Gulya and Nick Potkalitsky, we’ve been challenging the idea that “reflection” is enough. What’s emerging instead is a shift toward metacognition checkpoints as infrastructure (something that governs the entire interaction with AI), not something layered on after the fact.
That shift came out of cross-disciplinary work, where the same problem shows up differently in writing, science, and humanities — but the failure mode is identical.
The principle: don't outsource the structure of thinking to AI. Build it first, then use AI to sharpen that thinking.
Many of you are already familiar with my UnBlooms™ framework:
The stronger AI gets, the more important it becomes to know when to push back, when to create with it, and when to resist it entirely (not to avoid AI, but to stay in control of your own thinking). And I have some proposed metacognitive checkpoints to do this. They have recently been peer reviewed.






