The Decision Loop

DORA measures how fast code flows from commit to production. MOVE measures how fast the organization senses, decides, and acts. Both track real performance. Both share an assumption: that someone, somewhere, understood the system well enough to make the right call.

That assumption has a cost, and it's larger than most organizations realize.

The Loop

Every knowledge worker in software follows the same cycle. Form a hypothesis about the system. Explore it. Assess what you find. Decide what to do next. Engineers trace code paths. Product managers interrogate usage patterns. Designers audit component states. QA engineers reconstruct failure chains. The subject changes. The loop does not.

Two variables govern its speed. Tudor Girba and Simon Wardley name them in Rewilding Software Engineering, their book-length argument that the software industry has been optimizing the wrong variable for decades.

Time to Question (ttQ) — the time to formulate a specific, actionable question about a system.

Time to Answer (ttA) — the time from that question to a verified answer.

Nobody tracks either. Most organizations have never discussed them.

Time to Question

The time to formulate a specific, actionable question about a system.

For a question to drive a decision, Girba and Wardley argue it must be three things. Actionable: if answered, someone can do something with the result. Specific: relevant to this system in this context. Timely: the answer will arrive before the environment changes enough to invalidate it.

Most questions in organizations fail at least one criterion. "Why is retention dropping?" is actionable but too broad; the answer space is too large to guide action. "What's the P99 latency of the payment service?" is specific but may not be actionable for the person asking, because they lack the context to interpret what the number means for their decision. "How should we restructure the monolith?" fails all three if the answer takes six months and the organization will have reorganized twice by then.

A large corporation needed to optimize a central data pipeline by an order of magnitude. The business case was clear. After years of effort and millions of dollars, data moved through the pipeline at the same speed as before. The team was capable and the resources were there. The constraint was visibility.

They started where everyone starts: "Why are our services slow?"

Reasonable question. Almost useless, because answering it in the general case requires understanding the entire system, which is the problem that produced the question.

Weeks of exploration refined it. "Why are our services slow?" became "Is there useless data generated?", then "How is data traversing the pipeline?", then "How is a single attribute traversing the pipeline?", which, only after cheap answers made the landscape visible, became "What data is used only in the most recent campaign?"

Each refinement was more specific, more actionable, more valuable. The distance between the first question and the last is Time to Question made visible. In this case, it was measured in weeks.

Time to Answer

The time from a specific question about a system to a verified answer.

Time to Question gets less attention, but Time to Answer is what makes every question expensive.

Tracing how a single variable moved through the pipeline required person-weeks of manual inspection. The system contained tens of thousands of variables. At that rate, answering even basic structural questions was economically prohibitive. The team couldn't verify their own hypotheses. Their best model of the system, a hand-drawn architecture diagram, omitted an entire third-party system that nobody knew existed. Decisions were being made against a picture that was materially wrong.

This same structure appears everywhere the decision loop runs.

An engineer asks "what calls this service?" and spends days reading code across repositories because the dependency graph lives on a whiteboard, drawn from memory, last validated months ago. A product manager asks "which features drive retention in segment X?" and waits two weeks for a custom query because the analytics platform organizes data by event type, not user journey. A designer asks "what states can this component reach in production?" and manually navigates edge cases because the design system documents intended states, not actual ones.

In each case, formulating the question took seconds. Extracting a verified answer required days or weeks: reading code, building queries, navigating tools that weren't designed for the question being asked. Girba and Wardley use a recurring image for this: kitchen blenders digging deep mines. The tool works. It wasn't designed for this.

The Flywheel

When Time to Answer drops, when answering a question about your system takes minutes instead of weeks, you can afford more questions. More questions means more explored landscape. More landscape means sharper hypotheses. Sharper hypotheses target higher-value answers. And higher-value answers reveal territory you didn't know existed: what Girba and Wardley call the adjacent unexplored.

The data pipeline team built 54 contextual micro-tools in two person-months. Each tool answered a specific class of question directly from the live system, the way automated tests generate pass/fail signals from running code. The average tool was twelve lines of code. Total investment: a fraction of what manual investigation had consumed over years.
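To give a sense of scale, here is a minimal sketch of what a tool of roughly that size might look like. It is not one of the team's actual tools, which the source does not show. The `Stage` structure, the `PIPELINE` contents, and the function names are all hypothetical, and a real tool would read stage metadata from the live system rather than from a hand-written list.

```python
from typing import NamedTuple

class Stage(NamedTuple):
    name: str
    reads: frozenset
    writes: frozenset

# Hypothetical pipeline description, hard-coded only to keep the sketch runnable.
PIPELINE = [
    Stage("ingest", frozenset(), frozenset({"user_id", "campaign_id", "legacy_score"})),
    Stage("enrich", frozenset({"user_id"}), frozenset({"segment"})),
    Stage("report", frozenset({"segment", "campaign_id"}), frozenset()),
]

def trace_attribute(pipeline, attribute):
    """Answer 'how is a single attribute traversing the pipeline?'"""
    trail = []
    for stage in pipeline:
        if attribute in stage.reads:
            trail.append((stage.name, "reads"))
        if attribute in stage.writes:
            trail.append((stage.name, "writes"))
    return trail

def never_read(pipeline):
    """Answer 'is there useless data generated?': written somewhere, read nowhere."""
    written = set().union(*(s.writes for s in pipeline))
    read = set().union(*(s.reads for s in pipeline))
    return sorted(written - read)

print(trace_attribute(PIPELINE, "campaign_id"))
print(never_read(PIPELINE))
```

Each function answers one class of question and nothing else. Under these assumptions, the cost of asking about a second attribute is one more function call, not another round of manual inspection.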

The improvement factor was roughly 600x. The number understates what actually changed. Cheap answers made iterative exploration affordable, and that exploration revealed that the original question was aimed at the wrong part of the system entirely. The team couldn't have skipped to the right question. The path through wrong questions was the only path to the right one. What made the difference was a cost structure where each wrong question didn't consume person-weeks.

The reverse is equally instructive. When answers are expensive, each question is an investment. Teams choose carefully, conservatively, optimizing for probability of payoff over breadth of discovery. This is how organizations spend years and millions on a problem that two people with contextual tools resolve in a month.

This is why Time to Question improves when Time to Answer drops. The data pipeline team didn't learn to ask better questions through training. They learned because cheaper answers revealed more landscape, and more landscape made sharper questions possible. Cheap answers accelerate the accumulation of system knowledge that produces good questions, while expensive answers lock teams into generic questioning regardless of their expertise.

The Layer Beneath

Three frameworks now describe three layers of the same organizational system.

Figure: The Measurement Stack. Three layers: DORA (deployment), MOVE (operations), Decision Loop (comprehension).

Layer	Framework	Measures
Deployment	DORA	How fast code ships: cycle time, deployment frequency, failure rate, recovery time
Operations	MOVE	How fast the org operates: Momentum, Orchestration, Velocity, Exception
Comprehension	Decision Loop	How fast people understand: Time to Question, Time to Answer

Each depends on the one beneath it. DORA assumes someone decided what to deploy. MOVE assumes someone designed the workflow correctly. Both assume comprehension that neither measures.

When MOVE metrics plateau, when Momentum stalls despite automation and Exception rates climb despite investment, the cause is usually comprehension speed. An organization with 60% Orchestration and rising Exception rates is often facing exactly this problem: the workflows were designed against an incomplete picture of the system, because the people who designed them couldn't get answers fast enough to ask the right questions.

This also explains why AI copilots improve MOVE metrics to a point and then stop. Copilots accelerate execution within the existing understanding of the system. They do not change how fast the organization comprehends its own systems. Momentum lifts, Orchestration plateaus around 25%, and the decision loop, still running on expensive answers and broad questions, remains the binding constraint.

Decision Profiles

The two metrics cluster into recognizable patterns.

Tier	Time to Answer	Question Specificity
Elite	Minutes	Domain-specific
High	Hours	Moderately specific
Medium	Days	Generic
Low	Weeks+	Unformulated

Most organizations operate at Medium. Answers take days. Each question represents a significant investment, so teams choose questions conservatively. Exploration is treated as a cost rather than a source of value.

Thresholds vary by question type and system complexity. Minutes to answer "what services depend on this table?" is elite for legacy migration. It is meaningless for "should we enter this market?" Set baselines per question class.
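One way to make per-class baselines concrete is a small lookup of tier boundaries. The sketch below is illustrative only. The table names units (minutes, hours, days, weeks) but no exact cut-offs, so every boundary here is an assumption, and the `structural` and `strategic` class names are hypothetical.

```python
from datetime import timedelta

# Assumed upper bounds per tier for each question class.
# Anything at or above the last bound falls into "Low".
BOUNDS_BY_CLASS = {
    "structural": [  # e.g. "what services depend on this table?"
        ("Elite", timedelta(hours=1)),
        ("High", timedelta(days=1)),
        ("Medium", timedelta(weeks=1)),
    ],
    "strategic": [  # e.g. "should we enter this market?"
        ("Elite", timedelta(weeks=1)),
        ("High", timedelta(weeks=4)),
        ("Medium", timedelta(weeks=12)),
    ],
}

def tier(time_to_answer, question_class="structural"):
    """Map a measured Time to Answer onto a tier for one question class."""
    for name, upper_bound in BOUNDS_BY_CLASS[question_class]:
        if time_to_answer < upper_bound:
            return name
    return "Low"

print(tier(timedelta(minutes=20)))
print(tier(timedelta(days=3)))
print(tier(timedelta(days=3), "strategic"))
```

The same three-day answer lands in a different tier depending on the question class, which is the reason to set baselines per class instead of using one global scale.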

Getting Started

Pick one question each role answered in the last month. Something real.

Time the answer. When was the question first articulated? When did a verified answer arrive? Count everything: queue time waiting for another team, tool-switching between platforms, manual inspection, meetings convened to discuss what the data means. That total is your current Time to Answer.
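The arithmetic is simple enough to keep in a notebook or a short script. The following is a minimal sketch, assuming you can recover two timestamps and a rough calendar-time estimate for each component. The dates, component names, and durations are invented for illustration.

```python
from datetime import datetime, timedelta

def time_to_answer(asked_at, answered_at):
    """Elapsed time from first articulating the question to a verified answer."""
    if answered_at < asked_at:
        raise ValueError("answer cannot precede the question")
    return answered_at - asked_at

def breakdown(total, components):
    """Share of the total spent in each component, plus whatever is unaccounted for."""
    if total <= timedelta(0):
        raise ValueError("total must be positive")
    accounted = sum(components.values(), timedelta())
    shares = {name: spent / total for name, spent in components.items()}
    shares["unaccounted"] = (total - accounted) / total
    return shares

# Hypothetical example: a question asked on 4 March and answered on 18 March.
total = time_to_answer(datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 18, 9, 0))
shares = breakdown(total, {
    "queue": timedelta(days=9),            # waiting for another team
    "tool_switching": timedelta(days=1),
    "manual_inspection": timedelta(days=3),
    "meetings": timedelta(days=1),
})

print(total)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Keeping an explicit `unaccounted` share is a deliberate choice in this sketch: a large remainder suggests some of the waiting was never noticed as waiting.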

Audit the answer. Did it come from a generated view of the live system, or from a manually created artifact: a dashboard configured months ago, a diagram drawn from memory, a spreadsheet assembled from multiple sources? Was the tool built for this question, or was a general-purpose tool pressed into service?

Trace the question. Was this the first question asked, or the refinement of a broader one? How many cycles of question and answer did it take to reach this level of specificity? Where did refinement stall, because an intermediate answer was too expensive or the tooling couldn't support a sharper question? That path is Time to Question made visible.
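A refinement path can be recorded as a dated list of questions and summarized in a few lines. The sketch below reuses the pipeline team's sequence of questions. The dates are hypothetical (the source says only that refinement took weeks), and `summarize` is an illustrative helper, not an established metric definition.

```python
from datetime import date

# Question wording from the pipeline example; dates are invented.
CHAIN = [
    (date(2024, 1, 8), "Why are our services slow?"),
    (date(2024, 1, 15), "Is there useless data generated?"),
    (date(2024, 1, 19), "How is data traversing the pipeline?"),
    (date(2024, 1, 31), "How is a single attribute traversing the pipeline?"),
    (date(2024, 2, 5), "What data is used only in the most recent campaign?"),
]

def summarize(chain):
    """Summarize a refinement path: total span, cycle count, and longest stall."""
    ordered = sorted(chain)
    # Gap between each question and its refinement, tagged with the earlier question.
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return {
        "time_to_question": ordered[-1][0] - ordered[0][0],
        "cycles": len(gaps),
        "longest_stall": max(gaps) if gaps else None,
    }

summary = summarize(CHAIN)
print(summary["time_to_question"].days, "days over", summary["cycles"], "cycles")
print("longest stall after:", summary["longest_stall"][1])
```

The longest stall points at the intermediate question whose answer was most expensive, which is a reasonable place to look first for a missing tool.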

Repeat across roles. Engineering, product, design, QA. The question types will differ. The extraction patterns will differ. The structural bottleneck will be the same: generic tools producing expensive answers, suppressing question quality, and limiting exploration.

The most expensive activity in software is figuring out what to build and whether what was built is right. Every operational era develops its own measurement language. Manufacturing brought throughput and defect rates. DevOps brought DORA. The AI-native organization brought MOVE. The decision loop is still unmeasured.

Sources

Tudor Girba and Simon Wardley, Rewilding Software Engineering (feenk, 2025-2026): medium.com/feenk
Tudor Girba, "Developers spend most of their time figuring the system out" (feenk, 2024): blog.feenk.com