The Evolution of Bloom's Taxonomy, And Where It Was Always Heading in the Age of AI
What I mean by "Don't Grade the Output. Grade the Thinking": it's not just process-based learning.
I get asked about UnBlooms™ a lot lately.
But before I begin, I wonder: does anyone really know Bloom’s?
Most educators can name Bloom’s taxonomy. Almost none know its actual history.
Benjamin Bloom published his original framework in 1956, but it didn’t emerge from just him. It came out of a series of conferences held between 1948 and 1953, designed to find a common vocabulary for curriculum design and educational examinations. The committee was chaired by Professor Bloom but included four collaborators: Max Engelhart, Edward Furst, Walter Hill, and David Krathwohl.
What they produced was a classification of educational objectives — not a pyramid, not a hierarchy with “Create” at the top. That version didn’t even include creation as a category.
What Bloom actually emphasized was evaluation: the capacity to judge, assess, and critique.
The pyramid that now appears in every syllabus? That was a 2001 revision — introduced by Lorin Anderson (one of Bloom's former students) and Krathwohl (one of his original collaborators).
And what most people don’t realize: that revision didn’t just update the language. It flipped the top two levels. In the original framework, Evaluation sat above Synthesis. Anderson and Krathwohl swapped them, converted the categories from nouns to verbs, and installed “Create” at the summit.
This is the colorful version we know now.
Bloom spent his career pushing back on how his work had been repackaged. It bothered him.
He was right to be bothered.
(Oh, I can relate.)
What the pyramid got wrong
There’s something almost poetic about the fact that the most widely cited framework in education has been misrepresented for over 20 years. Teachers put it in syllabi. Administrators reference it in professional development sessions. Curriculum designers build entire units around it.
And most of them are working from the wrong version.
The original taxonomy wasn’t a ladder you climbed rung by rung. It was more evaluative, more recursive, closer to what good thinking actually looks like. The idea that you had to memorize before you could analyze, that you had to understand before you could create? That was a later interpretation. A tidy one. A wrong one.
It’s worth noting that Bloom’s team always intended the cognitive taxonomy to be part of a series. They planned to follow it with frameworks for affective and psychomotor learning. The pyramid that emerged had very little space for emotion, perception, or the messy, embodied reality of how people actually learn.
Researchers had been chipping away at Bloom’s limitations long before AI arrived. In 2013, L. Dee Fink proposed a model where different kinds of learning — foundational knowledge, application, reflection, caring — could happen in any order, feeding off each other simultaneously. No hierarchy. No ladder. It was closer to the truth than Bloom’s, but it still didn’t account for what happens when a machine can do the generating for you. Jason Gulya has made some great comments on this in our upcoming chapter together.
Then AI showed up and made it impossible to ignore
Here’s the thing: generative AI didn’t create the gap between learning process and academic product. That gap was always there; AI just made it impossible to ignore.
A student today can open any AI tool and skip straight to “Create.” They can generate a polished essay, a research summary, a lab report — without touching a single lower level of the taxonomy. If the goal was always to reach the top, AI just handed everyone an elevator.
Then colleagues showed up and argued we need to reverse it.
Take the inverted Bloom’s model by my colleague Michelle Kassorla: start with Create, then work backward through evaluation and analysis to build genuine understanding. It's a smart fix for writing-heavy courses.
However, in our experience, inversion still assumes a linear sequence, and it doesn't travel well outside text-based work. A student cannot use AI to run an enzyme assay or conduct a lab experiment. In those settings, starting with a polished output and working backward doesn't map onto how knowledge is actually built.
But UnBlooms™ doesn't reject Bloom's outright. It builds on it while going beyond it.
So what are we actually measuring now?
From a question to a framework
Sometimes the hardest part of designing anything is figuring out what problem you're actually trying to solve. Donald Schön called this "problem setting" and argued it's harder and more important than problem solving. I'd never read him when I built UnBlooms. But I put that question at the center of the loop anyway:
What is the problem you are trying to solve?
Then I put a rough version of my UnBlooms™ chart in front of educators at a Microsoft Teachers event in Chicago in 2023, then at a Microsoft AFT event in DC. At the time it was more of a provocation than a framework — I kept asking the same question in those rooms: if AI can already Create, Evaluate, and Analyze, what exactly are we asking students to do?
Nobody had a clean answer. The responses I got told me the question mattered.
Over the next two years, I developed UnBlooms™ through iterative practice-based research involving over 900 educators across university and K–12 settings, through workshops, faculty conversations, classroom observations, and direct student work. I presented the full framework in July; UnBlooms™ was then presented at the AI in Education conference at Oxford University in September 2025 and published on Zenodo with a full abstract and citation record.
The core insight was this: learning isn’t a pyramid. It never was. It’s a loop when seen from the top and a spiral when seen from the side: recursive, nonlinear, and context-dependent. A student doesn’t need to memorize before they can critique. They can enter the cycle at any point. What matters is that they’re doing the thinking, not outsourcing it.
At the center of UnBlooms™ is something no pyramid ever had room for: human agency and curiosity — not as an outcome to be measured at the end, but as the organizing principle of the entire framework. Everything else exists to protect and develop that. The metacognitive reflections, the reflective footnotes, the productive friction through Socratic tutoring, the process transparency — these aren't add-ons. They're how we know a human is actually thinking, not just submitting. In a post-AI assessment world, the question is never just what did you produce. It's how deliberately did you think along the way.
Since I presented it at Oxford and OpenAI last year, the framework has had over 2,000 views and downloads.
What it actually looks like
UnBlooms™ has three integrated components. First, a Critical Evaluation Scale and metacognitive checkpoints that move students from surface-level error detection all the way to systemic critique of AI outputs. Second, a reflective decision tree that helps educators ask: does AI meaningfully advance understanding here, or does it automate curiosity? Third, discipline-specific implementation guides, because what this looks like in a biology lab is genuinely different from what it looks like in a humanities classroom, but the underlying framework is the same. If you want to go deeper, all three are detailed in the workbook.
At Level 5, Resist, students may choose to work without AI because the struggle itself is the learning goal. It is the level people find most surprising. It might also be the most important.
A question I get asked lately: "Is it still worth learning to do something AI can simulate?"
That's the exact question the UnBlooms™ Resist level answers. It was peer reviewed at a conference in Norway last month, where I also discussed how it relates to the gradual release model. I'm still not sure the Resist level translates well outside Western academic contexts.
Thank you, Anna Mills, for the question.
April 2026: Independent convergence. Anthropic's "Getting Good at Claude" describes a "Discernment Spiral" concept, named and described in this Substack on March 11, 2026, six weeks prior. I laid out the timeline in another post.
What replaces “higher order” is:
The depth of your metacognitive reflection — are you aware of how you’re thinking?
The quality of your critique of AI outputs — are you catching assumptions, bias, missing voices?
The intentionality of your choices — are you deciding when to use AI and when not to?
What replaces “lower order” is:
Passive acceptance of AI output
Surface-level error spotting
Following instructions without questioning them
I built this without institutional support, from the ground up, while most places were still debating whether to ban AI entirely.
At Oxford I was told immediately to trademark it. So I did.
The trademark is a timestamp. It says this came from somewhere specific. If you use these ideas, just say where they came from. That’s all.
Why it’s spreading
The framework is being used in writing classes, biology labs, humanities courses, and national security conversations. I’ve presented it at symposiums with colleagues across disciplines. Another peer-reviewed paper is forthcoming.
I gift copies of my book when I’m invited to present.
The goal was never to lock it down or hide it behind a paywall. I want to share it as widely as possible so we can get early feedback on our paper with Jason Gulya and Nick Potkalitsky.
Don’t grade the output. Grade the thinking. But build the infrastructure first.
What does it look like in practice?
AI broke the illusion that a finished product was ever enough evidence of learning.
That is why UnBlooms™ does not ask only whether students can create, evaluate, or analyze. AI can already simulate all three.
It asks harder questions:
What did you notice?
What did you question?
What did you accept, reject, or revise?
And when did you decide that using AI would get in the way of learning?
Don’t grade the output. Grade the thinking.
But build the infrastructure first.
Everything I've described here works differently depending on your discipline, your students, and your context. That's exactly why I built the workshops: to work through it together in real settings. If you want to bring that conversation to your institution or conference, here's where to start.
This is why I don’t see UnBlooms™ as anti-Bloom.
In some ways, UnBlooms™ may be more faithful to Bloom’s original vision than the pyramid ever was. We are circling back to him, to what he actually said before it got simplified, reordered, and turned into a poster on every classroom wall.
Bloom’s taxonomy always emphasized evaluation — the ability to judge, assess, and critique. The pyramid buried that instinct under a hierarchy that AI has now made obsolete.
UnBlooms™ brings judgment back to the center. Where it always belonged.
A peer-reviewed paper, "Using Learning Analytics to Measure AI-Critical Literacy: A Within-Subjects Study of the UnBlooms™ Framework", was presented at LAK'26 in Bergen, Norway, in April 2026. A second peer-reviewed paper with Jason Gulya and Nick Potkalitsky is forthcoming.
The practice section is where this lands. The questions you end with — what did you notice, what did you question, what did you accept or revise — are doing something Bloom never did: making the thinking process the object of assessment, not just the product.
What strikes me is how much that reorients the teacher's job. You're not evaluating whether students climbed the pyramid. You're evaluating their judgment in real time. That's a harder skill to develop and a harder thing to assess, but it's the right thing.
The infrastructure point at the end is the one I'd push on. Building the conditions for that kind of metacognitive transparency at scale is where most institutions will stall, because the system wasn't designed to reward that kind of thinking in the first place.