Executive Resources · for UK SME leaders
How to Start an AI Pilot Properly
Most AI pilots fail before they have a chance to prove anything, and the failure is rarely technical. It is structural: wrong scope, no baseline measurement, no executive authority to remove blockers, and staff who were not involved in the design and do not trust the output. An eight-week pilot in three phases (design, execute, evaluate) guards against each of those failures. It needs one executive sponsor, one specific use case, a measured baseline, shadow-mode validation, a graduated ramp and explicit go/no-go criteria.
Why most AI pilots fail before they begin
The post-mortem on a failed AI pilot almost never lands on the model. It lands on scope that was too broad, a measurement baseline that was never taken, a sponsor who was not senior enough to clear blockers, and an end-user team that found out about the project the week before go-live.
Eight weeks of structure prevents all of that. The design phase commits the organisation to a single use case with a measured baseline and a real executive sponsor. The execute phase de-risks the AI by running it in shadow mode before it touches live work. The evaluate phase produces a defensible go or no-go decision, not a vibes-based extension into month four.
Phase 1: Design (weeks 1–2) — get executive sponsorship first
An AI pilot without an executive sponsor is a project waiting to stall. When a blocker appears — a budget approval, a scope dispute, a systems-access issue — someone needs the authority to resolve it in days, not months. That person must be a C-suite or SVP-level owner who meets with the pilot team at least monthly, has budget authority and will report results to the board.
Without this, the pilot will lose momentum the moment things get hard. And things always get hard.
Phase 1: Define one specific use case and measure the baseline
Good pilot use cases are narrow and well-defined: a single process, a clear input and output, a measurable outcome. Bad pilots try to explore what AI can do across a broad domain. You do not learn from exploration — you learn from a specific question with a specific answer.
A well-defined pilot reads like this: 'Automatically categorise incoming customer support tickets as urgent, high, medium or low to reduce manual sorting time from two hours to fifteen minutes daily.' Everything else is out of scope.
Before touching any AI tool, document the current state of the process: volume per day or week, time per item, error rate, cost per transaction and where the bottlenecks actually are. That is your before measurement. Without it, the after measurement means nothing.
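As a minimal sketch of what that baseline record might look like, the snippet below captures the measurements listed above in one structure. The class name, field names and all figures are illustrative assumptions, not part of any particular tool; the 120 tickets at one minute each simply reproduce the two hours of daily sorting from the example use case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessBaseline:
    """Snapshot of the current process, taken before any AI tool is introduced."""
    items_per_day: float        # volume
    minutes_per_item: float     # handling time per item
    error_rate: float           # fraction handled incorrectly, 0.0 to 1.0
    staff_cost_per_hour: float  # hourly staff cost in GBP (assumed figure)

    def daily_minutes(self) -> float:
        """Total staff minutes the process consumes per day."""
        return self.items_per_day * self.minutes_per_item

    def cost_per_item(self) -> float:
        """Staff cost of handling one item."""
        return self.minutes_per_item * self.staff_cost_per_hour / 60


# Illustrative numbers only; replace with your own measurements.
baseline = ProcessBaseline(
    items_per_day=120,
    minutes_per_item=1.0,
    error_rate=0.05,
    staff_cost_per_hour=30.0,
)
print(baseline.daily_minutes())  # 120.0 minutes, i.e. two hours a day
print(baseline.cost_per_item())  # 0.5
```

Freezing the record is a deliberate choice here: the baseline should not change once the pilot starts, so the after measurement is always compared against the same numbers.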
Phase 2: Execute (weeks 3–6) — run in shadow mode first
Do not replace the human process on day one. Run the AI in parallel for the first two weeks: staff perform the process as normal, the AI processes the same inputs, and you compare the outputs side by side. Shadow mode does three things:
- It measures real accuracy on your actual data before anything is at risk.
- It builds staff confidence as they see the AI perform accurately rather than just being told it is accurate.
- It surfaces edge cases and failure modes before they affect live work.
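The side-by-side comparison can be sketched in a few lines. This example assumes the staff decision is treated as the correct answer, so "accuracy" here really means agreement with staff; the function name and the sample labels are hypothetical.

```python
from collections import Counter


def shadow_report(human_labels, ai_labels):
    """Compare AI output with staff decisions made on the same inputs.

    Returns the agreement rate and a count of each (human, ai) disagreement,
    which is where edge cases and failure modes tend to show up.
    """
    if len(human_labels) != len(ai_labels):
        raise ValueError("both lists must cover the same inputs")
    if not human_labels:
        raise ValueError("no items to compare")

    disagreements = Counter(
        (human, ai) for human, ai in zip(human_labels, ai_labels) if human != ai
    )
    matches = len(human_labels) - sum(disagreements.values())
    return matches / len(human_labels), disagreements


# Hypothetical sample: five tickets sorted by staff and by the AI.
human = ["urgent", "high", "medium", "low", "medium"]
ai = ["urgent", "medium", "medium", "low", "medium"]

agreement, disagreements = shadow_report(human, ai)
print(f"{agreement:.0%}")            # 80%
print(disagreements.most_common())   # [(('high', 'medium'), 1)]
```

Reviewing the most common disagreement pairs with the staff who made the original decisions is one way to involve them in the validation rather than presenting them with a finished accuracy figure.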
Phase 2: Ramp gradually and track daily
Once shadow mode validates accuracy, increase AI-handled volume in stages: 20% in week three, 50% in week four, 80% in week five and 100% in week six with a manual fallback still available. Monitor accuracy and gather staff feedback at each stage. Do not accelerate the ramp if accuracy is below target or staff confidence is low.
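The ramp rule can be written down as a simple gate. The sketch below assumes the ramp reuses the pilot's own thresholds (80% accuracy, staff confidence of 7 out of 10) as the targets for moving up a stage; those gate values, and the function and parameter names, are assumptions for illustration.

```python
# Share of volume handled by the AI at each stage of the ramp.
RAMP_STAGES = [0.20, 0.50, 0.80, 1.00]


def next_ai_share(current_share, accuracy, staff_confidence,
                  accuracy_target=0.80, confidence_target=7.0):
    """Return the AI share for the next stage.

    The share only moves up when both gates pass; otherwise it holds
    at the current level until the problem is fixed.
    """
    if accuracy < accuracy_target or staff_confidence < confidence_target:
        return current_share  # hold: do not accelerate the ramp
    for stage in RAMP_STAGES:
        if stage > current_share:
            return stage
    return current_share  # already at 100%


print(next_ai_share(0.20, accuracy=0.90, staff_confidence=8))  # 0.5
print(next_ai_share(0.50, accuracy=0.70, staff_confidence=8))  # 0.5 (held)
```

The function never lowers the share. Whether to step back down after a bad week is a judgement for the sponsor and the team, and is left out of this sketch.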
A daily dashboard does not need to be sophisticated. A shared spreadsheet updated each morning showing volume processed, AI accuracy, manual overrides, time saved and any error patterns is enough. The point is that you see trends as they develop, not after they have become problems.
Phase 3: Evaluate (weeks 7–8) and run the go/no-go
Compare the post-pilot state to your baseline on every metric you measured at the start. Calculate actual ROI on an annual basis: hours saved per week multiplied by hourly staff cost and by the number of working weeks in a year, minus the annual tool cost. Ask staff to complete a short satisfaction survey covering their confidence in the AI, what still needs improvement and whether they would recommend extending it.
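A worked version of the ROI arithmetic is below, annualising the weekly time saving so it is comparable with a yearly tool cost. The hourly cost, tool cost and 46 working weeks are assumed figures for illustration; the 8.75 hours per week comes from the example use case (two hours down to fifteen minutes a day, over five days).

```python
def annual_net_saving(hours_saved_per_week, staff_cost_per_hour,
                      tool_cost_per_year, working_weeks_per_year=46):
    """Annual staff cost saved, minus the annual cost of the tool."""
    gross_saving = (hours_saved_per_week
                    * working_weeks_per_year
                    * staff_cost_per_hour)
    return gross_saving - tool_cost_per_year


# 1.75 hours saved per day x 5 days = 8.75 hours per week.
saving = annual_net_saving(
    hours_saved_per_week=8.75,
    staff_cost_per_hour=30.0,   # assumed
    tool_cost_per_year=3000.0,  # assumed
)
print(saving)  # 9075.0
```

A negative result means the tool costs more than the time it frees up, which is itself a useful input to the go/no-go decision.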
The criteria for a Go decision are explicit: accuracy at or above 80%, measurable time or cost savings, staff confidence at 7 out of 10 or higher, no unresolved governance concerns, and a sponsor who wants to scale. If any one of those criteria is not met, the right answer is either to fix the specific problem and rerun the pilot, or to wind it down and apply the learning to a different use case.
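Because the criteria are explicit, they can be checked mechanically. The sketch below encodes the five criteria as listed; the class and field names are hypothetical, and how each input is measured (for example, averaging survey scores into one confidence figure) is left as an assumption.

```python
from dataclasses import dataclass


@dataclass
class PilotResult:
    accuracy: float                 # 0.0 to 1.0
    has_measurable_savings: bool    # time or cost, against the baseline
    staff_confidence: float         # survey score out of 10
    open_governance_concerns: int   # count of unresolved concerns
    sponsor_wants_to_scale: bool


def failed_criteria(result):
    """Return the names of the Go criteria that were not met."""
    checks = {
        "accuracy at or above 80%": result.accuracy >= 0.80,
        "measurable time or cost savings": result.has_measurable_savings,
        "staff confidence 7/10 or higher": result.staff_confidence >= 7,
        "no unresolved governance concerns": result.open_governance_concerns == 0,
        "sponsor wants to scale": result.sponsor_wants_to_scale,
    }
    return [name for name, passed in checks.items() if not passed]


def decision(result):
    """Go only when every criterion passes."""
    return "go" if not failed_criteria(result) else "no-go"


# Hypothetical pilot outcome with one failing criterion.
result = PilotResult(0.86, True, 6.5, 0, True)
print(decision(result))         # no-go
print(failed_criteria(result))  # ['staff confidence 7/10 or higher']
```

Returning the failed criteria by name, rather than only a yes or no, points directly at the specific problem to fix before a rerun.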
Eight weeks is enough. A pilot that drags on for six months is not a pilot — it is an unmanaged project.
Common pilot pitfalls
Six failure patterns recur across the pilots that do not deliver. Each has a single, cheap prevention. None of the preventions are clever; all of them are skipped under time pressure.
| Pitfall | Impact | Prevention |
|---|---|---|
| No baseline measurement | Cannot prove ROI | Measure current state before AI |
| Multiple use cases at once | Diluted resources, no clear learning | One process per pilot |
| No executive sponsor | Pilot stalls when blocked | Confirm sponsor commitment before starting |
| End users not involved | Low adoption and trust | 20% of pilot team are operational staff |
| No fallback procedure | One failure breaks everything | Manual backup always ready |
| Wrong success metrics | Cannot demonstrate value | Define metrics before pilot, not during |
