The Economics of Judgment
An academic paper confirms it: the bottleneck is judgment, and it's getting more valuable
In recent months, I’ve shared thoughts on the human role in AI, the future of software engineering, what these changes mean for businesses, and why coding skills still matter. My perspective comes from hands-on experience: maintaining projects like Node.js, Fastify, Pino, and Undici, building Platformatic, and reviewing thousands of pull requests where AI handled the code and I made the final calls.
I recently found an academic paper that puts a formal structure to the trends I’ve noticed. “The Economics of Digital Intelligence Capital” by Yukun Zhang and Tianyang Zhang treats the AI industry as a new kind of economy. Their model uses solid math to explain patterns I’ve seen in practice and introduces ideas I hadn’t thought about before.
I’d like to share their main ideas and link them to my own experiences. Together, their theory and what we see in real life give us a clearer sense of where the industry is going.
The Red Queen Effect
The first big point in the paper is that an AI model’s value depends on how it compares to others, not just how good it is on its own. A model that was top-notch six months ago still works the same, but its value drops fast when a better one comes out.
The authors call this the Red Queen Effect, named after a character in Lewis Carroll who has to keep running just to stay in the same spot. In their view, whenever one company improves its model, it lowers the value of every competitor’s existing models. Even the top company loses value quickly if it can’t keep up.
This leads to what they call a constant “innovation tax.” In most industries, you can build a factory and use it for decades. In AI, your investment loses value not because it breaks down, but because someone else makes a better model. You have to keep investing just to stay where you are.
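The depreciation the authors describe can be made concrete with a toy model (the numbers here are mine, purely illustrative, not from the paper): a model's absolute capability stays fixed, but the market frontier improves by a constant factor each release cycle, so the model's relative value decays geometrically even though nothing about it degrades.

```go
package main

import "fmt"

// relativeValue returns a model's value relative to a market frontier
// that improves by `growth` per release cycle, while the model's own
// absolute capability stays fixed.
func relativeValue(capability, frontier, growth float64, cycles int) float64 {
	for i := 0; i < cycles; i++ {
		frontier *= growth // a competitor ships a better model
	}
	return capability / frontier
}

func main() {
	// Illustrative: frontier improves 30% per cycle, our capability stays at 100.
	for cycle := 0; cycle <= 4; cycle++ {
		fmt.Printf("cycle %d: relative value %.2f\n",
			cycle, relativeValue(100, 100, 1.3, cycle))
	}
}
```

After four cycles of standing still, the model is worth roughly a third of what it was, which is the "innovation tax" in miniature: you must keep investing just to hold relative value constant.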
The Red Queen Effect mainly affects the companies building foundation models, but it also impacts everyone using those models. Developers get new tools and features at a pace set by this competition, not by their own choices. When companies like OpenAI, Anthropic, Google, and DeepSeek are all racing to improve, the tools we use change quickly. What counts as “implementation” keeps shifting. Five years ago, building a REST API in an afternoon was valuable. Now, AI can do it in minutes. The skill is still useful, but the context has changed.
This is why judgment remains valuable. It isn't immune to change, but it sits deeper than anything the competition between models can commoditize. Knowing whether AI-generated code is correct, secure, and well designed takes real knowledge of algorithms, distributed systems, hardware, and caching. That kind of understanding can't be faked, and it takes years to develop.
The Structural Jevons Paradox
The paper’s second main idea matches closely with what I’ve seen in my own work.
William Stanley Jevons noticed in 1865 that more efficient steam engines didn’t reduce coal consumption. They increased it. More efficiency made coal-powered industry economically viable in contexts where it previously wasn’t, and total demand exploded.
Zhang and Zhang show that a similar pattern is happening with AI inference. As it gets cheaper to run AI queries, companies don’t just save money—they rethink how their systems work. They add more complex reasoning, bigger memory, and new tools. Simple prompt-and-response setups turn into multi-step systems that use much more computing power for each task.
The paper calls this the “token multiplier.” As running AI gets cheaper, systems become more complex, and each task uses a lot more tokens. Demand becomes “super-elastic,” meaning a 10% price drop leads to more than a 10% jump in usage.
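"Super-elastic" has a precise meaning. With a constant-elasticity demand curve (usage = base × price^(−e)) and e greater than 1, a 10% price cut raises usage by more than 10%. A minimal sketch, with an elasticity value I chose purely for illustration:

```go
package main

import (
	"fmt"
	"math"
)

// demand models constant-elasticity demand: usage = base * price^(-elasticity).
// When elasticity > 1 ("super-elastic"), usage grows faster than price falls.
func demand(base, price, elasticity float64) float64 {
	return base * math.Pow(price, -elasticity)
}

func main() {
	const elasticity = 1.5 // illustrative; the paper's point is only that it exceeds 1
	before := demand(1000, 1.0, elasticity)
	after := demand(1000, 0.9, elasticity) // a 10% price drop
	fmt.Printf("usage change: +%.1f%%\n", (after/before-1)*100)
	// 0.9^-1.5 is about 1.17, so usage jumps roughly 17%, outrunning the 10% cut
}
```

Total spend on inference therefore rises even as the unit price falls, which is the Jevons pattern restated in demand-curve terms.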
This matches what I’ve been describing in my articles, but from a hands-on perspective.
When I wrote about the future of software engineering for businesses, I talked about a new reality: custom tools that once took six months and three developers can now be built in two weeks by one senior engineer using AI. Integrations that used to be too expensive are now possible. Automations and internal tools that never made the cut are suddenly within reach.
The paper gives this trend an economic name. AI doesn’t just make old work cheaper; it also creates new kinds of demand. Now, a restaurant owner who couldn’t afford custom software can hire a local “software plumber.” IT teams that couldn’t justify custom integrations can build them in a week. Companies that used to rely on SaaS workarounds can now build exactly what they need.
This is the Jevons Paradox at work. Cheaper AI doesn’t lead to less development: it leads to much more development at every level of the market.
The Wrapper Trap
The paper’s third main idea is the one that should concern most people, and it’s something I hadn’t clearly put into words before.
The Wrapper Trap explains what happens when AI models improve and start replacing the value that application layers add. If your product is just a thin layer on top of a foundation model, you lose value every time the model gets better at what you do.
This is already happening. Many SaaS products are just thin layers on top of features that foundation models now handle directly. Customer support chatbots, content generators, code formatters, and data dashboards used to need their own models and pipelines. Now, a foundation model can do all of these. Each time the main model improves, another type of application-layer product becomes unnecessary.
The paper explains when this happens: it’s when the main model replaces, instead of supports, the value added by application layers. Simply put, if your product only adds a user interface and some prompt tweaks, you’re caught in the trap.
So what avoids the trap? This is where judgment becomes important in economic terms. The value that lasts is the kind that can’t be replaced by the main model—things like domain expertise, accountability, understanding specific business needs, and checking if the output is right for each situation.
Products that avoid the trap add something the foundation model can't take over. A platform that incorporates your company's unique deployment rules, compliance requirements, and operational constraints isn't just a wrapper. The same goes for anything built on deep domain knowledge: a specialized SaaS product that reflects years of experience in a specific industry, including its rules and exceptions, works with the model instead of just sitting on top of it. The real value is in the details, not the AI layer.
But there’s another important layer the paper doesn’t discuss, and it ties into my earlier writing. The most lasting value isn’t just technical know-how: it’s human connection. Being able to meet with a client, truly understand their needs (which are often different from what they first say), and turn that into a working solution. Empathy, product vision, and the ability to turn complex human problems into clear solutions matter most.
This is why the software plumber model works. It’s not just a wrapper on top of a foundation model. It’s a person who knows the restaurant owner, understands the real workflow, and builds something that fits. That relationship and understanding of context can’t be replaced by any model improvement. The same goes for the fractional senior engineer in a business setting. You’re not just selling code anymore. You’re offering the ability to figure out what a company really needs and the judgment to determine whether the solution actually works.
The Data Flywheel and Winner-Takes-All
The paper looks at data flywheels, which offer a new angle I hadn’t considered much before.
Their model shows that when better models attract more users, and those users provide feedback that makes the model even better, the market can shift to a winner-takes-all situation. This happens when new data comes in faster than it becomes outdated. After a certain point, the feedback loop feeds itself, and competition between providers becomes unstable.
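That threshold condition can be sketched with a toy loop (my simplification, not the paper's actual equations): users are drawn by the data stock, their usage adds new data, and old data becomes obsolete at a fixed rate. When inflow outpaces decay the stock compounds; otherwise it shrinks toward nothing.

```go
package main

import "fmt"

// flywheel runs a toy data-flywheel loop. Users scale with the data stock,
// each user contributes new data at rate `inflow`, and existing data becomes
// obsolete at rate `decay`. The sign of (inflow - decay) decides the outcome.
func flywheel(inflow, decay float64, steps int) float64 {
	data := 1.0
	for i := 0; i < steps; i++ {
		users := data                     // simplification: users track data quality
		data += inflow*users - decay*data // net change in the data stock
	}
	return data
}

func main() {
	fmt.Printf("inflow > decay: stock %.1f\n", flywheel(0.10, 0.05, 50)) // compounds
	fmt.Printf("inflow < decay: stock %.1f\n", flywheel(0.05, 0.10, 50)) // collapses
}
```

The instability is visible even in this crude version: there is no stable middle, only runaway growth or decline, which is why the paper predicts winner-takes-all dynamics past the tipping point.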
This matters a lot for the open source world, which I care about. If proprietary data flywheels are the main driver, open source models and platforms are at a disadvantage. They can compete on code, design, and community, but if winning depends on who gathers the most private usage data, open source can’t keep up there.
I don’t think this is the full picture. The paper looks at one part of the market, but real markets have other forces at play: regulations, businesses wanting control over their data, developers who prefer open systems, and the risk of getting stuck with one vendor. Still, it’s something I need to consider as I think about Platformatic and the Node.js ecosystem.
Very recently, Mario Zechner (the creator of the Pi coding agent) launched an initiative to share and catalog public development traces, a step toward that kind of open alternative: https://github.com/badlogic/pi-share-hf.
What This Means in Practice
The paper gave me better words for what I’ve been seeing, but it also changed how I think about two key questions.
The first question is about speed. In my earlier articles, I treated the move from coding to review as a one-time shift. The Red Queen Effect shows it has no clear end. The race between model providers means you have to keep moving or lose value. That constant pace keeps pushing new features to developers and keeps redefining what "implementation" means. Right now, human value lies in reviewing, judging, and understanding what people really need. But that work will keep changing as the tools change. The fundamentals I've recommended, algorithms, distributed systems, and hardware, matter because they're deep enough to stay useful through many of these shifts. They're not permanent, but they help you adapt. Just a month ago I spent an hour designing multi-threaded data structures to minimize false sharing across cache lines.
The second question is about where to build. The Jevons Paradox shows that the software market is growing, not shrinking. Every level now has more possible projects. But the Wrapper Trap reminds me that not every spot in this bigger market is safe. The products and companies that last are the ones with domain-specific knowledge that the main model can’t replace. Simple layers get pushed out. Deep, context-rich solutions survive.
The data flywheel is a warning I hadn’t fully considered. The system tends to favor a few big winners. Open ecosystems, which matter a lot to me, need to find strengths beyond just technical ability if they want to stay competitive in a market that rewards whoever gathers the most private usage data.
The Human in the Loop, Revisited
When I first wrote about the human-in-the-loop, my argument was about accountability. AI handles implementation, humans provide judgment, and that’s how it should be because someone has to be responsible for what ships.
The Zhang and Zhang paper showed me that this is also an economic issue. Judgment is a rare skill in a world where implementation is easy to get. The Jevons Paradox means that as implementation becomes more common, the need for judgment actually grows. Every new workflow, every app that’s now possible, and every company building custom tools instead of buying SaaS creates more demand for people who can check if the results are right, suitable, and safe.
The human in the loop isn’t a bottleneck to be optimized away. It’s the scarce resource that the entire expanding system depends on.
Start building your judgment now.