The Replacement Rate
Two guys in the jungle. A tiger charges. One kneels to tighten his shoelaces. The other yells: "You can't outrun a tiger!" First guy: "I don't have to outrun the tiger. I only have to outrun you."
Thorsten Ball used this joke recently to make a point about AI and the average software engineer. The joke is more precise than he may have intended. It contains, in five sentences, both a correct economic model and a game-theoretic trap. The model: your value isn't absolute; it's relative to the next-best alternative. The trap: when everyone tightens their shoelaces, the tiger catches someone anyway, and the race never ends.
Sports analytics formalised this intuition decades ago. The framework is called VORP: Value Over Replacement Player.
The Sabermetric Model
VORP was developed by Keith Woolner at Baseball Prospectus in the late 1990s to solve a problem that sounds deceptively simple: how do you compare the value of a catcher to a shortstop to a designated hitter? Raw statistics don't account for position, playing time, or the cost of the alternative.
The replacement level is the foundation. It represents the performance you'd get from a freely available substitute: a minor-league call-up, a waiver claim, a bench player signed for the league minimum. Not the average player, but the marginal one, the player your team could acquire tomorrow at negligible cost.
For a hitter, the math is straightforward:
VORP = (Player's value per PA − Replacement value per PA) × Plate Appearances
Value per plate appearance is calculated using linear weights: each batting event is assigned a run value based on its average contribution to scoring, derived from decades of play-by-play data. A single is worth roughly +0.70 runs. A home run, +1.65. A walk, +0.55. An out, zero. The replacement baseline sits at approximately 20 runs below league average per 600 plate appearances, or roughly −0.033 runs per PA.
A concrete example. A hitter generating +0.05 runs per PA above average, across 500 plate appearances, against a replacement level of −0.033 runs per PA:
Rate difference: 0.05 − (−0.033) = 0.083 runs per PA
VORP: 0.083 × 500 ≈ 41.5 runs above replacement
At the standard conversion of roughly 10 runs per win, that's about 4.15 wins above what a replacement player would have contributed. Barry Bonds in 2001 posted a VORP of 145.1. A replacement-level player sits at exactly zero; an average regular with a full season of plate appearances lands around twenty runs above it. The metric captures something intuitive in a formal way: value is not how good you are in isolation. Value is how much better you are than the alternative someone could get for free.
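A minimal sketch of the arithmetic, using the rough figures quoted above (the −0.033 baseline and the 10-runs-per-win conversion are approximations, not current-season calibrations):

```python
# Hitter VORP from the rough figures above. All constants are the
# approximations quoted in the text, not current-season calibrations.

REPLACEMENT_RUNS_PER_PA = -0.033   # ~20 runs below average per 600 PA
RUNS_PER_WIN = 10.0                # standard rough runs-to-wins conversion

def vorp(runs_above_avg_per_pa: float, plate_appearances: int) -> float:
    """Runs above replacement: (player rate - replacement rate) x volume."""
    return (runs_above_avg_per_pa - REPLACEMENT_RUNS_PER_PA) * plate_appearances

runs = vorp(0.05, 500)  # the worked example: +0.05 runs/PA over 500 PA
print(f"{runs:.1f} runs above replacement, {runs / RUNS_PER_WIN:.2f} wins")
# -> 41.5 runs above replacement, 4.15 wins
```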
Three properties of this model matter for what follows.
First, VORP is a marginal metric. It doesn't ask "how good are you?" It asks "how much better are you than the alternative I could get for free?" An excellent player on a team with equally excellent alternatives has lower VORP than that same player on a team where the alternative is a minor-leaguer. Value is relative to the replacement available, not absolute.
Second, VORP scales with playing time. A brilliant player who plays 50 games contributes less VORP than a good player who plays 150. Volume matters alongside rate. You have to show up.
Third, and this is the property that matters most: in baseball, the replacement level is roughly stable. Year to year, the talent pool at AAA, the depth of available minor-league call-ups, doesn't shift dramatically. The floor moves slowly. A player who maintains their skills maintains their VORP. The investment in improvement pays stable returns.
That third property is the one AI breaks.
The Software Engineer's VORP
Trey Causey made the connection explicit in a reply to Ball's thread: "Sports analytics uses this concept a lot, even formalising it into 'VORP' or 'Value Over Replacement Player.'" The analogy writes itself. What is a software engineer worth, relative to the replacement-level alternative?
Define engineer VORP analogously:
VORP_eng(t) = [V_eng(t) − V_rep(t)] × T
Where V_eng(t) is the value produced per unit time by a specific engineer, V_rep(t) is the value produced per unit time by the replacement-level alternative, and T is the time period. The replacement-level alternative in software has traditionally been a junior hire, an outsourced contractor, or an offshore team: cheap, available, functional but not exceptional.
Until recently, V_rep was roughly stable. A junior developer in 2015 could do approximately what a junior developer in 2020 could do. The replacement floor moved with language ecosystems and tooling, but slowly. An experienced engineer who stayed current maintained their VORP without extraordinary effort.
AI changed the derivative.
V_rep(t) is no longer approximately constant. It's increasing, and the rate of increase is itself accelerating. The replacement-level alternative in February 2026 is not a junior hire. It's an AI agent that can write functional code from a spec, debug common errors, refactor to match patterns, generate tests, and produce documentation. Thorsten Ball himself describes the output: "custom scripts running on my Raspberry Pi; a clone of a Windows app to control my ventilation system; an archive of my newsletter I can search and browse." Not production enterprise software, necessarily, but the kind of work that junior and mid-level engineers have traditionally done to justify their seat.
The formal consequence:
dVORP_eng/dt = dV_eng/dt − dV_rep/dt
For an engineer to maintain their VORP, the rate at which their value increases must exceed the rate at which the replacement level rises. If dV_eng/dt < dV_rep/dt, VORP declines even if the engineer is objectively improving. You can be getting better and falling behind simultaneously.
This is the property that breaks. In baseball, dV_rep/dt ≈ 0. An aging player can maintain VORP for years by holding steady. In software, dV_rep/dt > 0 and accelerating. The floor is rising faster than most engineers are climbing, and the treadmill is speeding up.
Sebastian Sigl caught this in the thread: "And the follow-up question nobody asks: what happens to the average engineer who stops growing because AI handles their daily work? Better than average today could mean worse than average in two years if the skill atrophy is real."
The math makes his intuition precise. If V_eng is constant (the engineer stops growing because AI handles their daily execution) and V_rep is increasing (AI capabilities improve quarterly), VORP goes negative on a timeline determined by the gap between current V_eng and the trajectory of V_rep. For an engineer whose value was primarily in code execution, the task AI commodifies most directly, that timeline is short.
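A sketch of that timeline, treating the floor's growth rate as an assumed parameter (the units and the 10-points-per-year figure below are illustrative, not measurements):

```python
# Sigl's stagnation scenario: V_eng flat, V_rep rising linearly.
# Units and growth rate are illustrative assumptions.

def periods_until_negative(v_eng: float, v_rep0: float, growth: float) -> float:
    """Time t at which VORP = v_eng - (v_rep0 + growth * t) crosses zero."""
    if growth <= 0:
        return float("inf")  # the baseball-like regime: a static floor
    return max(0.0, (v_eng - v_rep0) / growth)

# An engineer 30 points above today's floor, floor rising 10 points/year:
print(periods_until_negative(v_eng=130.0, v_rep0=100.0, growth=10.0))  # -> 3.0
```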
The Adoption Game
Now the game theory.
Model AI tool adoption as a symmetric two-player game. Two software engineers, each choosing between Adopt (A) and Don't Adopt (D). The payoffs reflect VORP outcomes, not absolute productivity.
Why VORP and not absolute output? Because the labour market prices engineers relative to alternatives. Your salary reflects your value above what a replacement could provide. If everyone gets more productive but the replacement level rises equally, wages don't increase. The relevant payoff is relative position.
The payoff matrix:
|                       | Engineer 2: Adopt (A) | Engineer 2: Don't (D) |
|-----------------------|-----------------------|-----------------------|
| Engineer 1: Adopt (A) | (3, 3)                | (5, 1)                |
| Engineer 1: Don't (D) | (1, 5)                | (4, 4)                |
The numbers are ordinal, not cardinal, but the structure matters.
When only one engineer adopts, the adopter gains high VORP: their productivity jumps, the industry-wide replacement level hasn't risen yet, and they're suddenly producing well above baseline. The non-adopter's VORP craters. They haven't changed, but the market's sense of what a "replacement" can do has shifted upward by their competitor's example.
When both adopt, the replacement level rises and both VORPs compress to moderate levels. They're both more productive in absolute terms. Neither has gained relative ground.
When neither adopts, both maintain their current VORP. The replacement level hasn't budged. This outcome, paradoxically, leaves both engineers with higher relative value than mutual adoption does.
Check the inequalities. Temptation (5) > Reward (4) > Punishment (3) > Sucker (1). This is a Prisoner's Dilemma.
The dominant strategy is Adopt. Regardless of what the other engineer does, adopting yields a higher payoff: 3 > 1 if they adopt; 5 > 4 if they don't. Both engineers adopt. The Nash equilibrium is (A, A) with payoffs (3, 3). But mutual non-adoption (D, D) yields (4, 4), which Pareto dominates. Both players would be better off if neither adopted, but neither can unilaterally choose not to adopt without risking the sucker payoff of 1.
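A brute-force check of that structure, using the ordinal payoffs from the table:

```python
# Verify the matrix's structure: Adopt is dominant, (A, A) is the unique
# Nash equilibrium, and (D, D) Pareto-dominates it. Payoffs as in the table.
from itertools import product

A, D = "adopt", "dont"
payoff = {(A, A): (3, 3), (A, D): (5, 1), (D, A): (1, 5), (D, D): (4, 4)}

def best_reply(options, their_move, player):
    """Moves that maximise `player`'s payoff given the opponent's move."""
    def mine(m):
        pair = (m, their_move) if player == 0 else (their_move, m)
        return payoff[pair][player]
    best = max(mine(m) for m in options)
    return {m for m in options if mine(m) == best}

# Nash equilibria: profiles where each move is a best reply to the other's.
nash = [(p, q) for p, q in product([A, D], repeat=2)
        if p in best_reply([A, D], q, 0) and q in best_reply([A, D], p, 1)]
print(nash)                                  # -> [('adopt', 'adopt')]
print(payoff[(A, A)], "<", payoff[(D, D)])   # (3, 3) < (4, 4): Pareto-dominated
```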
Tino Wening named it: "The whole 'outrun the other guy' mindset is exactly why the Bay Area is stuck in a Prisoner's Dilemma. You're taking a destructive Nash equilibrium and acting like it's business wisdom."
He's technically precise. The tiger joke isn't wisdom. It is a description of the trap. "Outrun the other guy" is the dominant strategy in a Prisoner's Dilemma, and the outcome is worse for everyone than the cooperative alternative.
N Players and the Ratchet
The two-player case is instructive but unrealistic. The software industry has millions of participants. Extend the model.
N engineers, each choosing adoption intensity a_i ∈ [0, 1]. The aggregate adoption level A = (1/N)Σa_i determines the replacement level:
V_rep(A) = V_rep₀ + αA
Where V_rep₀ is the baseline replacement level without AI adoption and α is the sensitivity of replacement level to aggregate adoption. Each engineer's VORP becomes:
VORP_i = V_i(a_i) − V_rep(A)
Where V_i(a_i) is increasing in a_i (adopting AI raises your individual productivity). Each engineer maximises their own VORP. The first-order condition:
∂V_i/∂a_i − α/N = 0
Each engineer internalises their own productivity gain from adoption but bears only 1/N of the externality they impose on everyone else's replacement level. This is a textbook negative externality, structurally identical to the tragedy of the commons. The private marginal benefit of adoption exceeds the social marginal benefit by a factor that grows with N. Everyone over-adopts relative to the social optimum.
The Nash equilibrium has every engineer at maximum adoption. The replacement level rises to V_rep₀ + α. Individual VORPs compress. The aggregate VORP across all engineers may actually decline relative to the pre-AI state, even though absolute productivity increased. The race made everyone faster and nobody relatively richer.
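A sketch of the over-adoption result. It assumes a concave private benefit V(a) = βa − (γ/2)a², so the private first-order condition V′(a) = α/N and the planner's condition V′(a) = α solve in closed form; all parameter values are illustrative:

```python
# Over-adoption under the linear replacement-level model. Assumes a concave
# private benefit V(a) = beta*a - (gamma/2)*a^2, so V'(a) = beta - gamma*a.
# All parameter values are illustrative.

def clip01(x: float) -> float:
    return min(1.0, max(0.0, x))

def nash_adoption(beta, gamma, alpha, n):
    """Private FOC: V'(a) = alpha/N (each bears 1/N of the externality)."""
    return clip01((beta - alpha / n) / gamma)

def social_adoption(beta, gamma, alpha):
    """Planner's FOC: V'(a) = alpha (the full externality is internalised)."""
    return clip01((beta - alpha) / gamma)

beta, gamma, alpha = 1.0, 1.0, 0.8
for n in (2, 10, 1_000_000):
    print(n, nash_adoption(beta, gamma, alpha, n), social_adoption(beta, gamma, alpha))
# As N grows, Nash adoption approaches 1.0 while the social optimum stays
# at 0.2: everyone over-adopts, and the gap widens with N.
```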
But the game doesn't play once. It repeats, and each round ratchets the floor higher.
Period 1: Early adopters gain VORP advantage. Non-adopters observe and feel the pressure.
Period 2: Non-adopters must adopt to recover lost VORP. Aggregate adoption A rises. V_rep rises.
Period 3: Early adopters' advantage has eroded. To maintain VORP, they must adopt more intensively: better tools, deeper integration, more autonomous workflows. A rises further. V_rep rises further.
Period t: The replacement level satisfies V_rep(t) = V_rep₀ + α × A(t), where A(t) → 1 as t → ∞.
This is a Red Queen dynamic. You run faster to stay in the same place. The equilibrium has everyone running at maximum speed and no one gaining ground. The tiger joke has a sequel nobody tells: after everyone tightened their shoelaces, the tiger ate the slowest runner. Then everyone tightened again. Then the tiger ate the next slowest. The race never ends because the replacement level ratchets up with every round.
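A toy version of the ratchet, with illustrative magnitudes: each period every engineer raises adoption intensity to defend relative position, which raises the floor for everyone:

```python
# Toy ratchet: adoption intensity creeps up each period, dragging the
# replacement floor with it. All magnitudes are illustrative.

V_REP0, ALPHA = 100.0, 50.0    # baseline floor and its sensitivity to A
BASE, GAIN = 120.0, 30.0       # per-engineer value: BASE + GAIN * adoption

adoption = [0.2, 0.5, 0.9]     # three engineers, staggered starting intensity
for period in range(1, 6):
    A = sum(adoption) / len(adoption)
    floor = V_REP0 + ALPHA * A
    vorps = [round(BASE + GAIN * a - floor, 1) for a in adoption]
    print(f"t={period}  A={A:.2f}  floor={floor:.1f}  VORPs={vorps}")
    adoption = [min(1.0, a + 0.2) for a in adoption]  # everyone ratchets up
# Once every a_i hits 1.0 the floor peaks at 150 and all VORPs compress
# to the same value: maximum effort, zero relative gain.
```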
The Queuing Catastrophe
Neither the VORP model nor the game theory captures what happens when the pipeline saturates. Armin Ronacher identified this in "The Final Bottleneck": software development is not a single-stage production process. It's a pipeline, and pipelines have throughput constraints.
Model the software delivery pipeline as a queue:
λ(t) = rate of code creation (PRs per unit time)
μ = rate of code review and integration (PRs per unit time)
L(t) = queue length (backlog of unreviewed PRs)
When λ < μ, the queue is stable. PRs get reviewed roughly as fast as they're created. The backlog stays bounded, and the average time a PR spends in the system converges to roughly 1/(μ−λ).
When λ > μ, the queue is unstable. The backlog grows without bound:
L(t) ≈ (λ − μ) × t
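A fluid, deterministic approximation makes the asymmetry between the two regimes visible; the rates below are illustrative, not measured:

```python
# Fluid approximation of the review queue: arrivals lam (lambda is reserved
# in Python), service rate mu. Rates in PRs/day, purely illustrative.

def backlog(lam: float, mu: float, days: int, start: float = 0.0) -> float:
    """Deterministic backlog: drains when mu > lam, grows linearly otherwise."""
    level = start
    for _ in range(days):
        level = max(0.0, level + lam - mu)
    return level

print(backlog(lam=8, mu=10, days=90, start=40))   # stable: drains to 0.0
print(backlog(lam=30, mu=10, days=90))            # unstable: (30-10)*90 = 1800.0
# Tripling lam with AI while mu stays fixed doesn't triple throughput;
# it converts the pipeline into an accumulating failure.
```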
AI adoption increases λ dramatically while μ (human review and integration capacity) remains roughly constant. Ronacher's example is concrete: the OpenClaw project has over 2,500 open pull requests. That isn't a backlog that will clear with more reviewers; it's a queue growing faster than any team can process.
"Anyone who has worked with queues knows this," Ronacher writes. "If input grows faster than throughput, you have an accumulating failure. At that point, backpressure and load shedding are the only things that retain a system that can still operate."
The queuing insight intersects with VORP in a way that neither framework predicts alone. An engineer's raw VORP might be high: they produce substantial value per unit time, well above replacement. But if their output enters a queue and sits unreviewed, the realised value is delayed or zero. Code that never ships has no value, regardless of its quality.
Define throughput-adjusted VORP:
VORP_eff = [min(V_eng, μ_system) − V_rep] × T
Where μ_system is the system's review and integration throughput. An engineer producing at rate 10μ has the same effective VORP as one producing at rate μ. Above the throughput ceiling, additional output doesn't increase value. It increases backlog.
Worse: excessive output imposes negative externalities on system throughput. Each additional PR in the review queue competes for the same finite review capacity, increasing wait times for every other PR, creating merge conflicts, and fragmenting reviewer attention. An engineer who floods the queue with AI-generated PRs doesn't just fail to add value above the throughput ceiling. They actively reduce the effective throughput available to everyone else.
This inverts the standard VORP calculation. In baseball, more plate appearances are unambiguously good if your per-PA rate is above replacement. In the software pipeline, more output can be worse than less output once the queue saturates; the sign on marginal contribution flips from positive to negative. The engineer who writes less, but writes reviewable, shippable, integrate-on-first-read code, may have higher effective VORP than one who generates five times the volume.
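A sketch of the adjusted metric, with illustrative rates (the μ, V_rep, and T values below are assumptions, not measurements):

```python
# Throughput-adjusted VORP: output above the system's review capacity adds
# backlog, not value. Numbers are illustrative.

def vorp_eff(v_eng: float, v_rep: float, mu_system: float, t: float) -> float:
    """Effective VORP: realised value is capped by review/integration throughput."""
    return (min(v_eng, mu_system) - v_rep) * t

MU, V_REP, T = 10.0, 6.0, 1.0
print(vorp_eff(9, V_REP, MU, T))    # 3.0 : below the ceiling, rate counts
print(vorp_eff(100, V_REP, MU, T))  # 4.0 : 10x the raw output, same realised value
# Past saturation, each marginal PR competes for the same reviewer
# attention, so more volume can push *system* throughput down.
```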
Ronacher's bathtub metaphor: "Software engineers often believe that if we make the bathtub bigger, overflow disappears. It doesn't."
Ethan Fann threads the needle from the practical side: "Writing code isn't the bottleneck anymore. What to build and why is a bottleneck. QAing what gets built is a bottleneck. Knowing when to rip something out and replacing it is a bottleneck. Pretty much, taste is the bottleneck."
Taste, in the queuing model, is the ability to maximise effective throughput per unit of review capacity consumed. The engineer with taste produces output that clears the review queue efficiently: well-scoped, well-tested, comprehensible, aligned with what the system actually needs. This is a different kind of skill altogether from producing output fast. The VORP model, unadjusted for throughput, cannot distinguish between the two. The throughput-adjusted version can.
The Production Function Nobody Wrote Down
The analysis now needs to confront its own assumption. VORP requires a production function: a measurable relationship between inputs (engineer time, skill, tools) and outputs (value created). Baseball has one. Plate appearances, hit types, run values; the whole apparatus rests on observable, countable events with well-established run expectancies derived from decades of play-by-play data.
Software has never had one.
Lines of code, story points, PRs merged, bugs fixed, features shipped: proxy metrics, and everyone in the industry knows they're unreliable. A 500-line PR that deletes dead code and simplifies an architecture may be worth more than a 5,000-line feature that introduces technical debt. A week of investigation that results in "we shouldn't build this" may be the most valuable output a team produces all quarter. The relationship between observable outputs and actual value has never been reliably specified.
This doesn't invalidate the VORP framework. It transforms it. VORP applied to software engineering is illuminating not because it produces clean numbers but because the attempt to calculate it exposes exactly what we don't understand about software value creation. The model fails to compute, and the failure is informative.
Consider what the model requires and what software can't provide:
Replacement level. In baseball, you can observe replacement-level performance across thousands of games at every position. In software, the replacement level is theoretical. What would an AI agent produce if given this task? The answer depends on the task, the codebase, the organisational context, the quality of the specification. There is no single replacement level; there's a distribution, and the distribution has fat tails in both directions.
Value per unit time. In baseball, run values are stable and well-calibrated across decades. In software, the value of a unit of work depends on what the market rewards, what the organisation needs, and what the system can absorb. An engineer who ships a critical security patch on a weekend produces more value in four hours than one who refactors CSS for a month. The same engineer, producing the same code, on a different Tuesday, for a different team, creates different value. Context is load-bearing.
Playing time. In baseball, plate appearances are observable and roughly equal in duration. In software, what counts as "playing time"? Hours at keyboard? Hours thinking about the problem while walking the dog? Hours in a meeting that prevents another team from building the wrong thing? The denominator in the VORP equation is undefined.
The VORP formula applied to software is a thought experiment, not a computation. But thought experiments have a distinguished history in economics. They clarify dynamics that empirical data can't isolate because the variables aren't separable. What the software VORP thought experiment clarifies: the structure of the value problem (marginal contribution above a rising replacement level, constrained by system throughput) is real even if the magnitudes are unknowable. The structural dynamics follow from the structure alone. The Prisoner's Dilemma, the ratchet, the queuing catastrophe: none require precise numbers. They require only two inequalities: dV_rep/dt > 0, and λ > μ. Both are observable. Both are true.
What Sits Above the Rising Floor
Assume the structural dynamics are correct. Replacement level rising, queuing constraint binding, adoption game locked in destructive equilibrium. What remains above the floor?
Ronacher's answer: accountability. "Non-sentient machines will never be able to carry responsibility, and it looks like we will need to deal with this problem before machines achieve this status. Regardless of how bizarre they appear to act already." The bottleneck isn't skill or speed or even judgment in the abstract. It's the willingness to be the person who signs off. The person who gets called when it breaks at 3am.
This echoes jbnews in the thread: "You are not the bottleneck, you are the accountable party."
Accountability is a strange kind of value. It doesn't show up in any production function. It isn't correlated with lines of code, PRs merged, or features shipped. It is the irreducible human contribution that exists because organisations need someone to be wrong. To own decisions, accept consequences, absorb blame when the machine's output creates problems that the machine cannot itself be held responsible for. In the formal model, accountability is the residual: whatever remains in V_eng after every measurable, automatable component of value has been absorbed into V_rep.
Aristotle called this kind of capacity phronesis: practical wisdom, the ability to perceive the right action in particular circumstances, developed through experience and character, fundamentally different from theoretical knowledge or technical skill. If post-AI software engineering is about phronesis, then the path above replacement level runs through accumulated judgment, contextual understanding, and the kind of wisdom that only years of operating within a specific domain can produce.
But Pierre Bourdieu would complicate this considerably. What looks like cultivated judgment, Bourdieu argued throughout his career, often functions as cultural capital: a way of maintaining status and excluding newcomers that masquerades as genuine expertise. "We need experienced engineers for their judgment" might describe a real capability. It might also describe a professional guild protecting its economic position by mystifying what it does. The fact that we can't write down the production function, that judgment resists measurement, is precisely what makes it useful as a status claim. If you can't measure it, you can't prove someone doesn't have it. And if you can't prove that, incumbency becomes its own credential.
Both are partially right, and the tension between them is where the strategic analysis lives.
Some of what we call engineering judgment is genuine phronesis. The architect who sees the subtle interaction between two systems that will cause problems at scale. The tech lead who recognises that the team is solving the wrong problem before anyone has written a line of code. The senior engineer who intuits that the elegant solution has a maintenance cost that will compound for years. This judgment is real, hard-won, and not yet reproducible by AI. It sits above the replacement level because it requires the kind of accumulated contextual knowledge that current systems don't possess and may not develop for some time.
And some of what we call engineering judgment is cultural capital. The insistence on specific code review rituals whose value has never been measured. The preference for one architectural style over another that reflects training rather than evidence. The gatekeeping that keeps teams small and homogeneous under the banner of "quality." This judgment sits above the replacement level only because the professional culture agrees to value it, not because it produces measurable outcomes.
The difficulty, and this is where the opacity of the production function bites hardest, is that we cannot reliably distinguish between the two. We cannot point to a specific act of judgment and say "that was genuine phronesis" or "that was status performance" because the outputs of both are entangled with each other and with luck, timing, team dynamics, and market conditions. The model is honest about this: VORP for software engineers is structurally real and numerically unknowable. The floor is rising; what sits above it is partly skill, partly social construction, and we lack the instrumentation to tell them apart.
Escape Moves
Are there strategies that escape the destructive Nash equilibrium? In game theory, Prisoner's Dilemmas have three well-known resolution mechanisms: iteration, coordination, and game change.
Iteration works when the game repeats and players can develop cooperative strategies through reciprocity. Tit-for-tat, the classic iterated PD strategy, cooperates initially and mirrors the other player's previous move. In the AI adoption game, iteration would look like professional norms: "we adopt AI at a measured pace, maintaining code review standards, not flooding the review queue." This is roughly what the deliberate friction camp advocates. Tommy Falkowski: "I increasingly think that deliberate friction is a necessary part of life." Giovanni Barillari: "I'm happy to be slow. You miss a lot otherwise tbh."
The problem with iteration is defection at scale. In a market with millions of participants, monitoring is impossible. You cannot enforce a norm of measured adoption when any individual engineer or company can defect by adopting faster and capturing temporary VORP advantage. Iterated PD resolution requires a small, identifiable group of repeated players who can observe and punish defection. The global software labour market is not that.
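The standard folk-theorem arithmetic makes the point precise. Using a grim-trigger punishment benchmark (the simplest such strategy; the text's tit-for-tat behaves similarly against permanent defection), cooperation survives only if the discount factor δ, the weight each player puts on future rounds against the same counterpart, satisfies δ ≥ (T − R)/(T − P). A quick check with the ordinal payoffs above:

```python
# Folk-theorem check with the ordinal payoffs above (T=5, R=4, P=3):
# grim-trigger cooperation is sustainable iff delta >= (T-R)/(T-P).
# A sketch of the mechanism, not a model of the labour market.

T_, R_, P_ = 5, 4, 3

def cooperation_sustainable(delta: float) -> bool:
    """Cooperate-forever beats defect-once-then-be-punished iff this holds."""
    return R_ / (1 - delta) >= T_ + delta * P_ / (1 - delta)

print(min(d / 100 for d in range(1, 100)
          if cooperation_sustainable(d / 100)))   # -> 0.5
# In an anonymous market of millions, the chance of re-meeting (and
# punishing) a defector is ~0, so effective delta ~ 0: cooperation fails.
```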
Coordination works through binding agreements. Unions, professional standards bodies, licensing frameworks. If the industry collectively agreed on adoption standards, the Prisoner's Dilemma would collapse: everyone cooperates, the replacement level stays manageable, VORPs stabilise. Patrick Senti gestures at this: "It stands to reason that productivity is not increased until throughput is. Thus we just quite simply reduce the pace to the slowest part of the pipeline."
The problem with coordination is that software engineering has virtually no binding coordination mechanisms. No union in most jurisdictions, no professional licensing, no standards body with enforcement power. The largest actors (the companies that employ the most engineers and set market expectations) have strong incentives to defect from any coordination agreement because they can absorb the throughput costs that smaller firms cannot. A coordination solution requires exactly the institutional infrastructure that the software industry has historically resisted building.
Game change is the most interesting resolution, and the most structurally different. Instead of playing the existing game better, you change the rules entirely.
Rose Bennett in the thread: "This is the 'horseless carriage' problem again. We called cars horseless carriages until we forgot about the horse entirely. Same thing is happening with agent-assisted PRs."
Devesh Pal: "Tickets and PRs are artifacts of human coordination overhead. Agents don't need tickets to coordinate with themselves."
The game-change move is to redesign the production process so the queuing constraint doesn't bind. If code review is the bottleneck, don't try to review faster; redesign the workflow so that review is embedded in creation. AI writes the code and validates it simultaneously. The human's role shifts from reviewing individual PRs to governing the system that produces and validates them. Yoav Tzfati formalises the economics: "I think we'll have to switch from reviewing code to having the AIs write much much better tests to prove that the behavior is right. Spending 10x the tokens on testing makes sense if it saves humans 2x on review."
In game-theoretic terms, this changes the payoff matrix entirely. The game is no longer "adopt AI tools within the existing workflow" but "redesign workflows so the throughput constraint becomes irrelevant." The first game is a Prisoner's Dilemma. The second game has a different structure: if workflow redesign increases both individual and collective throughput by removing the queue entirely, it can be a coordination game where the Nash equilibrium is also Pareto-optimal. Both players prefer the outcome where both redesign, and neither has an incentive to unilaterally defect back to the old workflow.
The question underneath this game change is where the commoditised code flows. Kojin Karatani's framework of exchange modes offers a useful lens. Does faster, cheaper code creation feed into platform extraction: Copilot, Cursor, the AI-assisted IDEs that capture the productivity gains as subscription revenue? That's Karatani's Mode B, capital accumulation, the platform as factory owner. Or does it feed into a software commons: open-source tools, shared infrastructure, collective capability that raises the absolute standard of living for all engineers even as it compresses individual VORP? That's Mode D, reciprocal exchange, the commons as mutual benefit.
The individual-level Prisoner's Dilemma of AI adoption may be inescapable within the current game. The higher-order question, which game the industry is constructing, is still open.
The Coasean Twist
Ball's other thread, the one about disposable software, points to something the formal model cannot accommodate:
"That makes me wonder whether for a certain category of software it's just no longer worth it to generalise and put the effort in to 'solve it once and for all for everybody', because the cost of 'everyone creates their individual software for their need' is now lower than it was before."
This is Coasean in structure. Ronald Coase's theory of the firm rests on comparing two costs: the cost of transacting in the market versus the cost of coordinating internally. Firms exist because internal coordination is cheaper than external transaction for certain activities. When transaction costs fall, firms shrink. When coordination costs fall, firms grow.
AI is collapsing the cost of bespoke software creation towards zero. Ball's custom Raspberry Pi scripts, his ventilation system clone, his newsletter archive: software for an audience of one. Previously, the cost of creating bespoke software for each user exceeded the cost of building generalised solutions sold at scale, so we got products. When creation cost approaches zero, the Coasean calculus inverts. It becomes cheaper for each person to have software built to their exact specification than to tolerate a generalised product that approximately fits.
His print service example crystallises it. ChatGPT Pro ran for 10-15 minutes, wrote Python code, produced a 4-page PDF formatted to a print service's specifications, and the code vanished. "Is that 'software' as we know it? Maybe for a minute. But now it's gone." The software was ephemeral. The value was in the output, not the code.
If this category grows, and Ball thinks it will, it scrambles the VORP analysis in a fundamental way. VORP assumes a production function where more of the output is better and where the output is durable enough to be valued. In a world of disposable, hyper-personal software, "the output" isn't a product that ships to users. It's ephemeral code that solves a problem for thirty seconds and disappears. The AI isn't replacing an engineer. It's replacing the decision not to build. The counterfactual isn't "a junior dev would have done this." It's "this wouldn't have existed at all."
For this category, there is no queue, no review, no backlog, no team, and no Prisoner's Dilemma. The software lives and dies in the time it takes to run, and the concept of "replacement level" is as meaningful as the concept of a replacement-level text message.
The formal models (VORP, the adoption game, the queuing analysis) describe the pressures facing engineers who build durable systems for organisations: software that is maintained, shipped, relied upon, and accountable to users. Those pressures are real, structural, and not resolvable by individual skill improvement alone. But the models also illuminate their own boundary. A growing fraction of software production is escaping the frameworks entirely, into territory where code is as disposable as a conversation and the concept of "replacement" doesn't apply because there was never a position to replace.
The Floor and the Field
Both categories share one feature: the floor is rising. Whether you are an engineer building durable systems or a person prompting disposable scripts, the capability of the replacement-level alternative increases with each model generation.
For durable software, the strategic question is how to maintain VORP above a floor that won't stop moving, within a game that punishes collective adoption, constrained by a pipeline that can't absorb the output. The math points toward three unstable positions: invest in non-automatable judgment (genuine phronesis, not cultural capital), redesign the workflow to change the game itself (eliminate the queue rather than trying to process it faster), or build the platforms that capture value from everyone else's adoption (own the infrastructure of the Red Queen race rather than running in it).
None of these are stable equilibria in the long run. Judgment gets automated when it becomes legible. Workflow redesigns create new bottlenecks further downstream. Platforms face competition from other platforms and from the commons. The floor keeps rising, and the field reshapes around each new contour.
For disposable software, the question is different and in some ways harder to hold. What does a profession look like when much of its traditional output becomes something anyone can conjure in minutes? When the valuable activity is no longer writing the code but knowing which code to conjure, for what purpose, with what constraints? This is Ronacher's final point, reframed: the bottleneck was always human, and it remains human. The nature of the bottleneck is what's changing.
The math describes the dynamics; the game theory, the trap; the queuing model, the constraint. None prescribe the outcome, because the outcome depends on which game we choose to play. And that choice, for now, is the one thing the models confirm sits above replacement level: the capacity to decide, not just what to build, but what kind of building to do.
Sources
- Keith Woolner, VORP methodology, Baseball Prospectus (2001-): baseballprospectus.com
- Thorsten Ball, "Joy and Curiosity #74" (2026): registerspill.thorstenball.com/p/joy-and-curiosity-74
- Thorsten Ball, X threads on agents, disposable software, and the average engineer (Feb 13-17, 2026): x.com/thorstenball
- Armin Ronacher, "The Final Bottleneck" (2026): lucumr.pocoo.org/2026/2/13/the-final-bottleneck/
- Trey Causey, reply connecting VORP to software engineering (Feb 16, 2026): x.com/quastora
- Tino Wening, reply on Prisoner's Dilemma and Nash equilibrium (Feb 15, 2026): x.com/TinoWening
- Sebastian Sigl, reply on skill atrophy (Feb 15, 2026): x.com/sesigl
- Ronald Coase, "The Nature of the Firm" (Economica, 1937)
- Kojin Karatani, The Structure of World History (Duke University Press, 2014)
- Pierre Bourdieu, Distinction: A Social Critique of the Judgement of Taste (Harvard University Press, 1984)