How to build a business case for AI before writing a line of code
Most AI projects do not fail because the technology stops working. They fail because nobody agreed, before a single line of code was written, on what "working" actually meant.
This is a structural problem, not a technical one. Engineering teams begin evaluating models. Product teams scope features. Leadership approves a budget. And somewhere in that sequence, the foundational question gets deferred: what does this need to achieve, at what cost, to justify the investment?
The consequences of skipping that question surface late and at great expense. A system reaches 94% accuracy and the team celebrates, until someone realizes the target was always 99%, because anything below that still requires the same manual review process it was supposed to replace. An AI agent successfully downloads utility bills from hundreds of provider portals, but the cost per transaction turns out to be three times what the existing supplier charges. The technology worked. The business case did not.
Building a rigorous business case before development begins is not a finance exercise bolted onto an engineering project. When done correctly, it becomes the architectural brief. It determines which model is good enough, which infrastructure is acceptable, and which constraints are non-negotiable.
The business case and the technical design are the same document.
What a pre-development business case actually requires
A useful AI business case answers three questions before any technical evaluation begins.
What existing spend does this replace?
Every AI initiative worth building displaces something: a third-party vendor, a manual process, a team performing repetitive work, or some combination of all three. Identifying and quantifying that displacement is the starting point. If the AI solution costs more than what it replaces, the investment logic collapses regardless of how impressive the technology is.
What accuracy does the replacement need for the economics to hold?
This is the question most teams skip, and it is the most consequential one. An AI system operating at 90% accuracy may still require significant human review, which means the labour cost it was supposed to eliminate persists. The accuracy target is not a technical aspiration; it is a business threshold. Below a certain point, the model does not replace the process; it adds a layer to it.
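That threshold can be computed directly rather than intuited. A minimal sketch, using entirely hypothetical figures (the per-document labour cost avoided and the cost of correcting one error are assumptions for illustration, not numbers from any real engagement):

```python
def breakeven_accuracy(labour_saved_per_doc, correction_cost_per_error):
    """Accuracy below which the cost of correcting errors wipes out
    the per-document labour the system was meant to eliminate.

    Net saving per document = labour_saved - (1 - accuracy) * correction_cost.
    Setting that net saving to zero gives the break-even accuracy.
    """
    return max(0.0, 1.0 - labour_saved_per_doc / correction_cost_per_error)

# Hypothetical: $2 of manual handling avoided per document, but each
# missed field costs $40 of review time to find and fix.
threshold = breakeven_accuracy(2.0, 40.0)
print(f"break-even accuracy: {threshold:.0%}")  # 95%
```

Anything below that figure means the system is, on net, adding cost rather than removing it, which is exactly the business threshold the text describes.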
What is the maximum allowable cost per unit of output?
Whether the unit is a processed document, a transaction, a classification decision, or a customer interaction, there is a price point above which the economics do not work. Defining that ceiling before development begins means every subsequent architectural decision (which model, which infrastructure, which hosting arrangement) is evaluated against a concrete constraint rather than a vague preference for efficiency.
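The ceiling itself can be derived mechanically from the spend being displaced. A sketch with hypothetical inputs (the vendor spend, annual volume, and required saving are all assumptions):

```python
def cost_per_unit_ceiling(annual_displaced_spend, annual_volume,
                          required_saving=0.0):
    """Maximum the new system may spend per unit of output.

    required_saving is the fraction of the displaced spend that must
    actually be saved for the project to be worth doing (0.3 = 30%).
    """
    return annual_displaced_spend * (1.0 - required_saving) / annual_volume

# Hypothetical: a $300k/year vendor, 100k documents/year, and a demand
# that the replacement come in at least 30% cheaper than the status quo.
ceiling = cost_per_unit_ceiling(300_000, 100_000, required_saving=0.3)
print(f"${ceiling:.2f} per document")  # $2.10
```

Any candidate architecture whose projected inference, hosting, and maintenance cost per unit lands above that number is out of bounds before benchmarking even starts.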
The relationship between the business case and the technical design
When these three questions are answered clearly, something important happens: the scope of the technical problem becomes specific. This is not incidental. It fundamentally changes how an engineering team approaches the build.
Consider the difference between "build an AI that extracts data from invoices accurately" and "build an AI that extracts data from invoices at 99% accuracy, at a cost of under $X per invoice, running entirely within our own infrastructure." The first framing produces a feature. The second produces a system with defined success criteria, testable benchmarks, and a cost model that can be validated before deployment.
In practice, the cost-per-unit ceiling often drives infrastructure decisions that would not otherwise be obvious. A model that achieves high accuracy using a commercial API may be technically impressive but commercially unviable at scale. That constraint pushes the engineering team toward self-hosted open-source models, a different set of trade-offs, different fine-tuning requirements, and a different deployment architecture. None of that becomes clear without the business case establishing the ceiling first.
Similarly, the accuracy target shapes the evaluation methodology. If the acceptable error rate is 1% across 100 extracted fields, teams need benchmark datasets, structured evaluation runs, and a rigorous understanding of where the model fails before committing to a particular approach. Without that target, benchmarking becomes qualitative and the scale-to-production decision becomes a guess.
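When the target is a defined error rate across extracted fields, the core of the benchmark harness can be as simple as counting exact field matches against a gold dataset. A minimal sketch (the dict-of-fields representation of an extraction is an assumption about how results are stored):

```python
def field_accuracy(predictions, gold):
    """Fraction of gold fields the model reproduced exactly.

    predictions and gold are parallel lists of dicts, one per document,
    mapping field name -> extracted value.
    """
    total = correct = 0
    for pred, true in zip(predictions, gold):
        for field, value in true.items():
            total += 1
            correct += int(pred.get(field) == value)
    return correct / total if total else 0.0

gold = [{"amount": "412.50", "due_date": "2024-07-01"}]
pred = [{"amount": "412.50", "due_date": "2024-07-10"}]
print(field_accuracy(pred, gold))  # 0.5
```

Scoring per field rather than per document is what turns "the model sometimes fails" into a structured picture of where it fails, which is the understanding the scale-to-production decision depends on.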
Case study: U.S.-based energy intelligence platform
To see what this looks like in practice, consider a project we worked on that shows how directly a business case can shape technical architecture.
A U.S.-based energy intelligence platform was managing utility data for a large portfolio of commercial and industrial clients. Each month, utility invoices arrived from hundreds of providers across North America, each with a different format, field structure, and layout. The client had been using a third-party service to collect and extract this data, but accuracy had degraded, and the manual correction overhead had grown to the point where the economics of the arrangement no longer held.
The brief, at its simplest, was to replace that third-party service with an AI-powered pipeline. Before any technical evaluation began, we worked with the client to define the business case precisely. What did the existing provider cost annually? What did the internal team spend correcting errors it introduced? What headcount was involved, and what would need to change for those costs to be reduced? The answers to those questions established a baseline: the total cost of the current state, across all its components.
From that baseline, two constraints followed. The first was a cost-per-invoice ceiling, the maximum the new system could spend per bill processed for the investment to generate a positive return over a three-year horizon. The second was an accuracy floor of 97%, the point below which the volume of errors would still require a level of manual review that made the cost savings disappear.
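With hypothetical figures in place of the client's confidential ones, the ceiling calculation looks like this (the annual baseline cost, build cost, and invoice volume below are invented for illustration; only the three-year horizon comes from the engagement):

```python
def cost_per_invoice_ceiling(annual_baseline_cost, build_cost,
                             annual_volume, years=3):
    """Max cost per invoice for the replacement to break even
    over the given horizon.

    Break-even condition:
        years * volume * ceiling + build_cost = years * annual_baseline_cost
    """
    return (years * annual_baseline_cost - build_cost) / (years * annual_volume)

# Hypothetical: $400k/year current state (vendor fees plus internal
# correction labour), $300k to build, 120k invoices per year.
ceiling = cost_per_invoice_ceiling(400_000, 300_000, 120_000)
print(f"${ceiling:.2f} per invoice")  # $2.50
```

The output is a single number every subsequent model and hosting decision can be tested against.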
Those two numbers changed the architecture before the architecture existed. When we began evaluating language models for the extraction task, the cost-per-unit ceiling immediately narrowed the field. Several capable hosted models were ruled out, not because they performed poorly, but because their inference costs at the expected processing volumes could not meet the ceiling. The evaluation shifted to open-source models, including Mistral and Qwen, that could be self-hosted on hardware priced to meet the constraint.
The accuracy target shaped the development process in different ways. Achieving 93% to 94% accuracy through initial fine-tuning was relatively quick. Reaching 97% required a more layered approach: splitting invoices into segments before extraction, running multiple model passes and combining the outputs, and ultimately building a mechanism to retrieve historical invoices from the same provider and use that context to inform the current extraction. Each of those steps was a direct response to a defined target, not an open-ended effort to improve the model. The business case gave the team a finish line, and the engineering work was organized around reaching it.
Why the business case also reveals what "done" looks like
One of the less obvious functions of a pre-development business case is that it defines the endpoint. Without it, AI projects tend to expand into indefinite optimization cycles, because there is no agreed standard for when the system is good enough to replace what it was meant to replace.
When the business case specifies that the existing third-party provider costs a defined amount annually, that a manual review team can be reduced by a specific headcount, and that the new system needs to hit a defined cost and accuracy threshold to make those reductions possible, the team has a finish line. Progress is measurable. The decision to move to production, or to keep iterating, is grounded in evidence rather than engineering preference.
This also changes how teams handle the distance between early performance and the target. A system reaching 97% accuracy when the target is 99% is not nearly done; it is at a meaningful decision point. The team must evaluate whether the remaining gap can be closed through additional fine-tuning, architectural changes such as splitting the document into segments before extraction, or contextual enrichment from historical data. Each of those approaches has a cost and a timeline. The business case gives the team a framework for making that evaluation rather than simply continuing to iterate.
Where business cases most commonly fail
Even when teams attempt a pre-development business case, several failure modes recur.
The displacement calculation is optimistic. Teams calculate the cost of the vendor or process being replaced, but underestimate the transition costs, integration overhead, and ongoing maintenance required to sustain the AI system. The net saving looks attractive in the model and disappoints in the first year of operation.
The accuracy target is set by intuition rather than process analysis. A target of 97% accuracy sounds rigorous, but it may be too low if the downstream process requires manual correction of every error, or unnecessarily high if errors are caught cheaply by a downstream validation step. The right accuracy target comes from mapping the process, not from picking a number that feels ambitious.
The cost-per-unit ceiling ignores infrastructure at scale. A system that meets its cost target during testing, when processing hundreds of documents, may blow past it at production volumes of hundreds of thousands. Architectural decisions that are cheap at small scale (model size, API calls, redundancy requirements) behave differently at volume. The business case needs to model the cost curve, not just the unit cost at the expected initial throughput.
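Modelling the curve rather than a single unit cost takes only a few lines. In this sketch the API price per document and the self-hosting fixed and marginal costs are invented numbers; the point is that the cheaper option flips as volume grows, not the specific figures:

```python
def annual_cost(volume, fixed, per_doc):
    """Total yearly cost as a function of processing volume."""
    return fixed + per_doc * volume

# Hypothetical cost structures: a commercial API with no fixed cost but
# a high per-document price, versus self-hosting with a large fixed
# hardware-and-ops bill and a small marginal cost.
hosted_api = lambda v: annual_cost(v, fixed=0, per_doc=0.25)
self_hosted = lambda v: annual_cost(v, fixed=60_000, per_doc=0.01)

for volume in (1_000, 50_000, 250_000, 1_000_000):
    cheaper = "API" if hosted_api(volume) < self_hosted(volume) else "self-hosted"
    print(f"{volume:>9,} docs/yr -> {cheaper}")
```

A proof of concept run at the low end of that table can comfortably meet its cost target while the production system, several orders of magnitude along the curve, cannot.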
The business case treats compliance and security as outside scope. In regulated environments or where data sensitivity is a constraint, the business case must include the cost of operating within those constraints. Deploying a self-hosted model to avoid sending credentials or sensitive data to a third-party API is a business requirement, not a technical preference. The cost of that requirement should be included in the model from the beginning.
Building the case before the code
The practical implication of all of this is straightforward: the business case should be completed, challenged, and agreed upon before the technical evaluation begins, not in parallel with it.
That sequencing is harder than it sounds. There is always pressure to start building something while the business case is still being developed. Proof-of-concept work can feel productive and create momentum. But a proof-of-concept that has not been scoped against a cost ceiling or an accuracy threshold is likely testing the wrong thing. It validates technical feasibility without establishing commercial viability, and those are different questions.
The teams that manage AI investments well tend to treat the business case as the first engineering artifact of the project. It constrains the solution space before the solution is designed. It sets the evaluation criteria before the evaluation begins. And it defines what the project needs to deliver before anyone decides how to deliver it.
That discipline is what separates AI projects that compound value from those that compound cost.
Navigate AI adoption with our assistance
If you want to understand whether AI can strengthen your architecture or whether it would amplify existing issues, we can help you assess that.