
INNOVATIVE ARTIFICIAL INTELLIGENCE PLATFORM

FOR AUTOMATED WEB APPLICATION DEVELOPMENT

SPEED · QUALITY · SECURITY

From pilot to production in 90 days — how CDF eliminates Pilot Purgatory in software development

  • Mar 8
  • 7 min read

Series: CDF 1.3.2 in practice — 6 articles on the methodology of sovereign AI implementations

This is the second article in the series (G1). In the previous issue: Cognitive SLA (S1) — AI reasoning quality metrics. The GENESIS series focuses on agent governance, scaling, and operations. The SAVANT series focuses on compliance, quality measurement, and oversight. CDF 1.3.2 is a proprietary methodology developed by allclouds.pl, based on ISO/IEC 42001:2023 and the EU AI Act.



In AI projects, the hardest part is not building a demo. The hardest part is getting the solution to the point where it works stably in the business process, has an owner, metrics, costs, governance, and a scaling path.


This is where many initiatives fall into what CDF 1.3.2 calls Pilot Purgatory. The prototype works, the team is happy, the board presentation looks good — but the solution never reaches production, or stays stuck at the experimental stage for months. The project is not formally closed, yet it never becomes part of real work. It simply drags on.


The problem does not start after the pilot

Many organizations assume that you first need to "build something quickly" and only then organize the architecture, costs, roles, compliance, and operating model. This is intuitive, but it is precisely this order of actions that very often leads to getting stuck.


CDF 1.3.2 identifies five types of AI transformation blockages: Pilot Purgatory, Scale Paralysis, Governance Gridlock, Cultural Resistance, and Data Readiness. The very inclusion of Pilot Purgatory in the formal diagnostic model shows that the problem is not an exception, but a recurring organizational pattern.


In practice, pilots that get stuck for months rarely fail because the model "doesn't work." More often, they fail because no one has defined in advance how to measure success, when the project should go into production, who makes the decision to scale, and how much the production environment will actually cost.


CDF starts with the question: what happens next?

The most important difference in the CDF approach is that the path to production is not added after the pilot. It is defined at the very beginning — in Phase 0 — before the organization delves deeper into building the solution.


The methodology requires two critical elements here. The first is the Innovation Plateau Diagnostic, which identifies the type of transformation blockage and prepares a tailored remediation plan. The second is Scale Path Definition: the mandatory specification of pilot exit criteria, the path to production, the Production Cost Model, and kill/pivot criteria.


This approach changes the logic of the project. The team does not build a "pilot with hope" that it will somehow be possible to launch it more widely later. It designs the initiative from the outset as a potential production implementation with a clearly defined transition path.


Scale Path Definition uses the results of the Sovereignty Level Assessment (SLA-S), which determines the required level of sovereignty for a given use case. We discuss how CDF selects an implementation model — Hybrid, On-Premise, or Air-Gapped — in the article "Sovereignty Level Assessment — how to choose the right level of AI sovereignty and not overpay."


What Pilot Purgatory really is

Pilot Purgatory is a situation in which an organization can demonstrate a working proof of concept but is unable to transform it into a production solution. The project is not formally closed, but it also does not scale, does not receive a full budget, does not have a stable operating model, and does not become part of a real business process.


This is particularly common in AI and software development, where it is very easy to build an impressive demonstration, but much more difficult to ensure cost predictability, security, accountability, and quality control in production. The more a solution relies on agents, integrations, and autonomy, the more important governance and the architecture of the transition from experiment to operation become.


That is why CDF treats the transition to production as a separate design problem, rather than a natural "next stage" after a successful demo. Without this, many teams confuse activity with progress: they build more features, but do not get any closer to actual launch.


Four things that must be established before launch

Scale Path Definition is one of the strongest elements of CDF. The methodology requires four things to be defined at the outset:

  • Pilot exit criteria — measurable pilot success criteria that determine when to move to production. Example: Reasoning Accuracy ≥90% and Hallucination Rate ≤5%, sustained for 30 days.

  • Path to production — the conditions, decision-makers, dependencies, and schedule for entering the production environment. Example: CTO signature + Security Review + Agent Governance approved.

  • Production Cost Model — full production environment costs, not just experiment costs. Example: GPU infrastructure, inference costs, licenses, CogOps FTE.

  • Kill/pivot criteria — when to stop or pivot the project. Example: accuracy <80% after 60 days, TCO >150% of budget, no business sponsor.

Such project discipline can be uncomfortable at first, but it saves months of chaos later on. Instead of "dragging out the pilot," the organization has a predetermined investment and operational decision-making mechanism.
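As a minimal sketch, the exit and kill/pivot thresholds above can be expressed as a single decision gate. The metric names, units, and the `pilot_decision` function are hypothetical illustrations; only the thresholds come from the article:

```python
# Hypothetical sketch of pilot exit / kill-pivot gates. The thresholds
# mirror the article's examples; the API itself is illustrative and
# not part of CDF.
from dataclasses import dataclass

@dataclass
class PilotMetrics:
    reasoning_accuracy: float   # fraction, e.g. 0.92 for 92%
    hallucination_rate: float   # fraction, e.g. 0.04 for 4%
    days_observed: int          # length of the measurement window
    tco_vs_budget: float        # 1.0 = on budget, 1.5 = 150% of budget
    has_business_sponsor: bool

def pilot_decision(m: PilotMetrics) -> str:
    """Return 'kill', 'promote', or 'continue' for a pilot."""
    # Kill/pivot criteria are checked first: accuracy <80% after
    # 60 days, TCO >150% of budget, or no business sponsor.
    if ((m.reasoning_accuracy < 0.80 and m.days_observed >= 60)
            or m.tco_vs_budget > 1.5
            or not m.has_business_sponsor):
        return "kill"
    # Pilot exit criteria: accuracy >=90% and hallucination rate
    # <=5%, sustained for at least 30 days.
    if (m.reasoning_accuracy >= 0.90 and m.hallucination_rate <= 0.05
            and m.days_observed >= 30):
        return "promote"
    return "continue"
```

Checking the kill conditions before the exit conditions encodes the article's point that a pilot over budget or without a sponsor should not be promoted, however good its metrics.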


Pilot exit criteria are most often based on Cognitive SLA metrics: Reasoning Accuracy, Hallucination Rate, and Confidence Calibration. We describe the full table of seven metrics in the article "Cognitive SLA — why 99.9% uptime is not enough when AI supports decisions in your company."


Agents need governance, not just code

In classic software projects, the transition from MVP to production is difficult anyway. In AI agent-based projects, this problem is even greater because of issues such as levels of autonomy, reasoning quality, monitoring, agent registry, and inference cost control.


CDF takes this specificity into account from Phase 2 onwards, where Agent Governance becomes a mandatory part of the architecture. The Central Agent Registry stores nine mandatory fields for each agent — from role and autonomy level, through token budgets, to Kill-Switch Authority and audit history.


If an autonomous software development platform generates or orchestrates developer agents, the transition to production cannot rely solely on increasing their number. It must also mean subjecting them to a formal governance model — with assigned responsibilities, cost limits, and emergency procedures.
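The registry idea above can be sketched as a data structure. The article names five of the nine mandatory fields (role, autonomy level, token budget, Kill-Switch Authority, audit history); the field names, types, and the `record` helper below are illustrative assumptions, not the methodology's actual schema:

```python
# Illustrative sketch of a Central Agent Registry entry. Only five of
# the nine mandatory CDF fields are named in the article; everything
# else here (names, types, the record() helper) is assumed.
from dataclasses import dataclass, field

@dataclass
class AgentRegistryEntry:
    agent_id: str
    role: str                      # e.g. "code-review"
    autonomy_level: int            # assumed scale, 0 (supervised) .. 3 (autonomous)
    token_budget_per_day: int      # hard inference-cost limit
    kill_switch_authority: str     # who may halt the agent, e.g. "CTO"
    audit_history: list = field(default_factory=list)

    def record(self, event: str) -> None:
        """Append an event to the agent's audit history."""
        self.audit_history.append(event)
```

The point of such an entry is that scaling means adding governed rows to a registry, not merely spawning more agents.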


This is the difference between scaling and uncontrolled multiplication of agents. CDF enforces this distinction before a problem arises, not after the fact.


The complete Agent Governance model — Agent Registry, three interaction patterns, three types of Kill-Switch, and emergency procedures — is described in detail in the article "Agent Governance — how to manage a swarm of 50 AI agents without losing control."


Structure, not just speed

GENESIS-AI is not meant to be just a tool for faster software generation. It is meant to be a platform that, thanks to CDF, provides a structure for transitioning from experiment to scalable operating model. Each agent can operate within its assigned level of autonomy, token budget, and emergency permissions, and the project can be accounted for against pre-determined milestones.


CDF additionally introduces milestones reviewed every 90 days and an ROI Baseline Report that accounts for the productivity J-curve — an initial dip in efficiency during implementation, followed by accelerating gains. As a result, the project is evaluated not through the lens of initial enthusiasm but through its ability to deliver measurable value over time.


90 days is not a slogan, but a decision-making rhythm

The title "90 days" is best understood not as a promise of magical implementation of every project in three months, but as a way of organizing decisions and accountability. CDF assumes measurable milestones every 90 days, not a never-ending experiment without checkpoints.


This is a very important distinction. An organization does not need another pilot that "will take a while." It needs a rhythm in which it is clear what has been tested, what has been delivered, what production costs are, and whether the project meets the conditions for further scaling.


For software houses and development teams, this rhythm is particularly valuable. Instead of an open-ended "we're doing AI" horizon, they get a cycle with clear evaluation points — every 90 days, someone has to answer the question of whether the initiative is moving toward production or stuck in Pilot Purgatory.


From Deployment Pattern Library to Enterprise AI Catalog

Projects that move from pilot to production do not end with launch. CDF envisions a further path in Phase 5: scaling and industrialization. Key elements include the Deployment Pattern Library — cataloged, reusable deployment patterns — and the Reusable Asset Registry, where the Center of Excellence defines standards and federated teams in Business Units implement ready-made solutions.


Every successful production project becomes a model to be replicated. Every agent that has gone through the full cycle from Design through Build, Test, Deploy, Monitor to Optimize/Retire can be the basis for subsequent implementations in new departments or with new customers.


This is the moment when the organization stops "doing AI projects" and starts building operational capacity. But that moment will never come if the first pilot does not have a path to production.


What to check before you start

If an organization is building a software solution involving AI or agents, it is worth asking a few questions before starting development:

  • What are the exit criteria for the pilot?

  • Who will make the decision to go into production and on what basis?

  • Is there a Production Cost Model for production scale, not just for the demo?

  • What are the kill/pivot criteria if the project does not meet its objectives?

  • Will agents be subject to formal agent governance or just improvised orchestration?

  • Does the organization measure progress by milestones or by a general sense that it is "moving forward"?
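The checklist above can be turned into a simple readiness gate. This is a hedged sketch, assuming yes/no answers; the question keys paraphrase the article's questions, and the three-way risk cutoff is my own arbitrary choice:

```python
# Hedged sketch of a Pilot Purgatory readiness gate. The six keys
# paraphrase the article's checklist; the risk thresholds are
# illustrative, not part of CDF.
READINESS_QUESTIONS = [
    "pilot_exit_criteria_defined",
    "production_decision_owner_named",
    "production_cost_model_exists",
    "kill_pivot_criteria_defined",
    "formal_agent_governance_planned",
    "progress_measured_by_milestones",
]

def pilot_purgatory_risk(answers: dict) -> str:
    """Classify risk from yes/no answers to the six questions."""
    unanswered = [q for q in READINESS_QUESTIONS if not answers.get(q)]
    if not unanswered:
        return "low"
    # Arbitrary cutoff: three or more open questions means high risk.
    return "high" if len(unanswered) >= 3 else "elevated"
```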


If there are no answers to these questions, the risk of Pilot Purgatory is very high. Then even a technologically sound project can get stuck because there is no designed path to production.


The most important lesson from CDF is simple: the transition from pilot to production does not begin after the pilot. It begins when the organization defines how success will be measured — along with the scaling path, costs, governance, and decision conditions — before building starts.


The methodology does not promise an easier path. It organizes what in most companies remains unsaid until it is too late. And in software development, where the pace is increasing and agents are growing month by month, this decision-making structure is the difference between a project that matures and a project that gets stuck.


Once the path to production is defined, the question of sovereignty arises: where to put the infrastructure, which implementation model to choose, and how not to overpay. In the next article, we describe Sovereignty Level Assessment — a formal assessment of AI sovereignty across four dimensions.

 
 
 
