Deployment — RAI Swarms

00 — How deployments die

Six Common Ways a Program Quietly Ends.

None of these are about the robot failing. They are about the deployment failing around the robot. Each one is a real pattern we have seen end real programs.

D · 01 · Heroics tax site · 1

The Engineer Becomes the Runtime.

One specialist is on site for four months. The pilot looks healthy. The day they fly home, overrides triple. There is no second site because there is no second engineer.

€420k · engineering · per site · annualised

D · 02 · Trust collapse floor · supervisor

Three Unexplained Pauses and the Supervisor Disables It.

Trust on a floor is binary. After three silent pauses, the platform is off; recovering trust takes weeks of operator-relations work, not engineering work.

2–6 wk · re-onboarding · per shift team

D · 03 · Site-2 stall CFO · review

Site Two Costs What Site One Cost.

No cross-site reuse, no inherited memory, no shared operator interface. The economic case dies at the budget review and the program quietly ends.

68% · pilots that never reach site 2

D · 04 · Vendor blame loop corridor · 14

Two Vendors Blame Each Other for Every Deadlock.

Vendor A logs say B was wrong. Vendor B logs say A was wrong. The CFO asks what they are paying for. The platform stays parked while contracts are renegotiated.

€18–€60/min · throughput loss · per stall

D · 05 · Override fatigue last hour of shift

Operators Get Tired. The Platform Does Not Notice.

Override rate doubles in the last hour. Supervision quality collapses. The same failure repeats Monday because no override was learned from.

+22% · cost per exception · last quartile

D · 06 · Silent drift week · 6

Performance Slips 0.3% Per Shift — Until It’s 78%.

No single event triggers alarm. The drift is the failure. By month two, the program runs at three-quarters of pilot and nobody can say when it slipped.

€0 saved · cumulative · year one
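
A rough check on the drift figures above. This is a sketch under two assumptions that the reviews do not state: the 0.3% loss compounds shift over shift, and the site runs two shifts a day, seven days a week. The function name and the schedule constant are illustrative.

```python
import math


def shifts_until(target_fraction: float, loss_per_shift: float) -> int:
    """Shifts needed for a compounding per-shift loss to reach target_fraction of baseline."""
    return math.ceil(math.log(target_fraction) / math.log(1.0 - loss_per_shift))


# Assumed schedule (hypothetical): 2 shifts a day, 7 days a week.
SHIFTS_PER_WEEK = 14

n_shifts = shifts_until(0.78, 0.003)
weeks = n_shifts / SHIFTS_PER_WEEK
print(n_shifts, round(weeks, 1))  # 83 5.9
```

Under those assumptions, 78% arrives after roughly 83 shifts, about six weeks, with no single shift losing enough to trip a per-shift alarm.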

Patterns assembled from cross-vendor deployment reviews 2023–2025. Indicative only.

01 — What actually breaks

Six Conditions That Appear on Every Floor.

Not benchmark gaps. Specific, repeatable conditions that exist on every site and that the robot has to operate through, not around.

F · 01

Blocked Corridors.

A pallet jack parked across a thoroughfare. Two units route around it; the third deadlocks. Recovery time is the metric, not detection.

F · 02

WiFi Degradation Under Load.

Coverage looks fine at install. By Tuesday afternoon, the warehouse network is saturated and the platform’s coordination latency triples.

F · 03

Human Workflow Drift.

A team-lead adds a hand-off step on Monday morning. The robot is still operating against last week’s workflow on Tuesday.

F · 04

Map Divergence.

The world has moved. The map has not. Localisation reports nominal; planning is now wrong. The first symptom is a small idle-time creep.

F · 05

Inventory Mismatch.

WMS says bay 14. Bay 14 is empty. Real inventory sits at bay 17 because a picker re-shelved. The robot picks nothing and times out.

F · 06

Unexpected Operator Behaviour.

Operator stops the robot mid-task to ask a question, walks away, never resumes. The platform must understand the world has paused, not failed.

02 — Pilot vs deployment

Different Problem. Different Physics.

A pilot can be designed around. A deployment must be designed for. The conditions are not the same, and most platforms are still optimising for the wrong one.

Pilot

Scripted. Supervised. Optimised Conditions.

· Pre-mapped, pre-cleared, pre-staged environment.
· Engineering team on-site, sometimes mid-task.
· Exception list known in advance.
· One robot, one corridor, one workflow.
· Success means making the demo land.

Time horizon · days–weeks

Deployment

Continuous. Variable. Operational Pressure.

· Real shifts, real workflows, real interruptions.
· Operators supervise from a console, not the floor.
· Exception list never closes; the workflow keeps moving.
· Multi-platform fleet, multi-site coordination.
· Success means operating reliably for twelve months.

Time horizon · months–years

03 — Anatomy of a stalled pilot

A Real-Shape Postmortem.

Composite from three programs we reviewed. Nothing here is engineered — it is what actually happens between the demo and the budget call.

Week 0

Pilot lands. Demo is excellent.

Platform handles the rehearsed flow. Stakeholders attend. Approval expands to site two.

internal momentum · high

Week 3

Exception rate doubles in real shifts.

SKU mix changes weekly. The platform was tuned to one. Engineering team commits to a “quick patch” that lives for the rest of the program.

+2 FTE · indefinitely

Week 5

Override storm during night shift.

Supervisor disables two units after three unexplained pauses. Trust is now binary on that floor, and binary trust is hard to repair.

throughput −18% · ongoing

Week 8

The engineer becomes the runtime.

One specialist holds the program together. Their calendar is the platform’s recovery policy. There is no documented escalation discipline.

single point of failure · named

Week 12

Vendor blame loop opens.

Two AMR platforms deadlock in corridor 14. Vendor A blames Vendor B. Procurement is asked to choose. The site stops trusting either.

contracts re-opened · 6 weeks

Week 18

Site two delayed indefinitely.

CFO review. No reused integration, no inherited memory, no operator interface that travels. The economic case for site two does not survive the meeting.

program · soft cancellation

Week 26

Quietly written off. No second site.

The platform stays on site one. The capability is real; the runtime never was. Future programs are funded with less ambition.

future ambition · reduced

The runtime layer addresses every one of these rows · explicitly · by design.

04 — The integration wall

The Hard Problem Is Not Isolated Intelligence.

“The vendor blamed our wiring. The integrator blamed the vendor. The CFO asked what we were paying for. Six weeks later, we’re still parked.”

Programme lead · Logistics · UK · site 2 review

Many robotics systems pass demos but fail at deployment because the hard problem is not isolated intelligence. It is integration — with real environments, workflows, operators, safety constraints, edge cases, maintenance schedules, and business operations that nobody planned for at pilot time.

Every site is its own integration. Without a runtime, that integration is rebuilt from zero on every floor. With one, it compounds — and only then does the second site cost less than the first.

05 — Deployment readiness

Autonomy Is a Spectrum, Not a Switch.

Six levels from controlled demo to fleet-level adaptive intelligence. The Cognitive OS is what carries a platform across the levels — not a single deployment, but the path itself.

Level 0

Controlled Demo

Scripted environment, engineering supervision, success conditions hand-tuned. Useful for trust-building, not for operations.

Level 1

Scripted Pilot

Fixed sequence in a semi-controlled site. Exceptions still require engineering presence. Most public robotics videos sit here.

Level 2

Assisted Field Operation

Real environment, real workflow, but with active supervision. The robot does the task; humans handle every deviation.

Level 3

Human-Supervised Autonomy

The robot handles routine and most exceptions. Operators supervise from a console and intervene on escalation.

Level 4

Multi-Site Operational Reliability

The same runtime, the same fleet posture, across multiple sites. Site-specific knowledge carried by memory, not by re-engineering.

Level 5

Fleet-Level Adaptive Intelligence

Cross-unit learning, cross-site adaptation. Every deployment improves the next. The platform compounds instead of repeating.

06 — Deployment telemetry

Operating Signals, Not Benchmark Scores.

A representative live operational picture. These are the signals an operations console shows on screen — and the ones the runtime uses to refine policy in the background.

T · 01 · Completion 94.3% ↑ +1.8 pts · 7 days

T · 02 · Intervention rate 3.1/hr · flat · 7 days

T · 03 · Recovery latency 11.2s avg ↑ better · -3.4s · 7 days

T · 04 · Drift index 0.08 ↑ rising · watch

T · 05 · Operator confidence 0.87 ↑ +0.04 · 7 days

T · 06 · Fleet learning curve +18% / mo compounding

T · 07 · Escalation queue 02 open · stable

T · 08 · Memory hit rate 71% ↑ inheriting fleet knowledge

Representative · operational signals · not a marketing chart

06b — Escalation routing

Who Gets Paged, by What, and When.

Escalation is wired into the runtime as a system, not as on-call discretion. The right context goes to the right human at the right confidence threshold — with the decision trail behind it.

Routine confidence routing
Recovery / floor escalation
Hard constraint · engineering
Cross-vendor arbitration

Operator psychology · why this matters

Supervisors disable platforms when escalations arrive without context, or when the same ambiguous alert fires three times. Routing is not a notification problem; it is a trust-management problem. The runtime treats it as such.
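
A minimal sketch of threshold-based routing across the four routes named in this section. Everything else is an assumption for illustration: the `Event` fields, the 0.6 threshold, and the precedence order (hard constraint first, then cross-vendor, then confidence) are hypothetical, not the runtime's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    ROUTINE = "routine confidence routing"
    FLOOR = "recovery / floor escalation"
    ENGINEERING = "hard constraint / engineering"
    ARBITRATION = "cross-vendor arbitration"


@dataclass(frozen=True)
class Event:
    unit: str
    confidence: float       # 0.0 to 1.0, as reported by the platform
    hard_constraint: bool   # a hard limit was hit
    vendors: frozenset      # vendors whose units are involved
    trace: str              # decision trail that travels with the page


# Hypothetical threshold; a real site would tune it per workflow.
FLOOR_THRESHOLD = 0.6


def route(event: Event) -> Route:
    """Pick one route per event, most severe condition first."""
    if event.hard_constraint:
        return Route.ENGINEERING
    if len(event.vendors) > 1:
        return Route.ARBITRATION
    if event.confidence < FLOOR_THRESHOLD:
        return Route.FLOOR
    return Route.ROUTINE


low_confidence = Event("R-03", 0.41, False, frozenset({"vendor-B"}), "sensor occluded")
deadlock = Event("R-05", 0.90, False, frozenset({"vendor-A", "vendor-B"}), "corridor deadlock")
print(route(low_confidence).value)  # recovery / floor escalation
print(route(deadlock).value)        # cross-vendor arbitration
```

The `trace` field is carried on every event so that whichever human is paged receives the context with the page, not after it.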

06c — Incident propagation

One Anomaly. Two Timelines.

The same fault under the same load. The difference between a contained incident and a shift-long cascade is whether a runtime is sitting underneath the platform.

Cascade · without runtime
Contained · with runtime
Human page-out
Safe-hold transition

Net effect

Same fault, same load. Without a runtime: 22 minutes, €1,180, one trust hit, no lesson captured. With one: 50 seconds, no human page, lesson written to fleet memory.

INCIDENT 042 · DC-NL-04 · 14:21:09 · RECOVERED

unit: R-07 / vendor-B
trigger: confidence drop · 0.41
cause: sensor occluded · forklift
recovery latency: 41 s [threshold 15 s]
contained at: t + 8 s · safe-hold
escalated to: none · auto-resolved
memory write: pattern · cross-site policy
next occurrence: handled · < 5 s expected

SHIFT LOG · DC-NL-04 · T+12h excerpt · raw

06:42:18 · vendor-B stall · corridor 14 · deadlock
07:01:55 · manual resolve · floor lead · trust-hit
09:14:02 · bin 8E reshelved · WMS diverges
11:38:47 · override R-03 · note: “don’t know”
14:21:09 · R-07 recovery 41 s · cascade contained
17:55:30 · 8E pattern auto-applied · no recur
22:00:00 · end shift · 6 events · 1 trust-hit · 2 captures

Pick rate · weeks 1–8 · silent drift · no alarm fired

−22% vs week 1 · cumulative
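
One way to make slow drift fire an alarm is a one-sided CUSUM, which accumulates small shortfalls instead of testing each shift in isolation. This is a sketch of that standard technique, not a description of how the runtime computes its drift index. The baseline, slack, and limit values are hypothetical, and the input series assumes a 0.3% compounding loss per shift.

```python
from typing import Optional, Sequence


def cusum_low(values: Sequence[float], baseline: float,
              slack: float, limit: float) -> Optional[int]:
    """Index of the first sample where the accumulated shortfall exceeds limit, else None."""
    total = 0.0
    for i, value in enumerate(values):
        # Shortfalls smaller than `slack` are treated as noise and drain the sum.
        total = max(0.0, total + (baseline - value - slack))
        if total > limit:
            return i
    return None


# Pick rate as a percentage of week 1, losing 0.3% per shift (assumed series).
pick_rate = [100.0 * 0.997 ** shift for shift in range(1, 84)]

first_alarm = cusum_low(pick_rate, baseline=100.0, slack=1.0, limit=5.0)
print(first_alarm + 1)  # 9: the alarm fires on the ninth shift
```

With these settings the detector fires while the pick rate is still above 97% of baseline, long before any single-shift threshold would notice.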

07 — Deployment economics

The Unit Economics of Autonomy.

A deployment is judged by what it costs to integrate, what each exception costs to handle, and whether the cost-per-new-site is going down. The runtime is what bends the curve.

Cost driver	Without runtime	With Cognitive OS	Why
Engineering hours · new site	8–14 weeks	1–2 weeks	Site knowledge inherits via memory; integration is a config, not a project.
Engineering hours · new unit	2–3 weeks	1–3 days	New unit attaches to the runtime; the runtime already knows the floor.
Cost per exception (early)	$120–$300	$30–$60	Operator console + audit trail + runtime context, instead of a site visit.
Cost per exception (mature)	≈ unchanged	$5–$15	The fleet has learned the exception. It rarely escalates.
Scalability bottleneck	Engineering team	Operator coverage	The bottleneck moves from rebuilding to supervision — the right kind of cost.
Cross-site reuse	0–15%	60–85%	Memory transfer + shared runtime modules + standard operator interface.

Indicative ranges from real-world deployment programs. Exact economics depend on platform mix.
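
The first row of the table can be turned into a rollout estimate. A minimal sketch: the per-site ranges are copied from the table, while the site count and the assumption that effort scales linearly with the number of sites are hypothetical.

```python
# Engineering weeks per new site, copied from the table above.
WEEKS_PER_SITE = {
    "without runtime": (8, 14),
    "with runtime": (1, 2),
}

# Hypothetical rollout size, for illustration only.
NEW_SITES = 5


def rollout_weeks(per_site: tuple, new_sites: int) -> tuple:
    """Low and high engineering-week totals, assuming effort scales linearly."""
    low, high = per_site
    return low * new_sites, high * new_sites


for label, per_site in WEEKS_PER_SITE.items():
    low, high = rollout_weeks(per_site, NEW_SITES)
    print(f"{label}: {low}-{high} engineering weeks for {NEW_SITES} new sites")
```

For five new sites this gives 40 to 70 engineering weeks without a runtime and 5 to 10 with one, which is the gap a budget review is looking at.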

08 — Operational metrics

Operational Metrics, Not Benchmark Scores.

A deployment is judged by how it behaves on the floor — across shifts, environments, operators, and exceptions. Eight metrics define operational maturity.

/ 01

Task Completion Reliability

Share of tasks completed without engineering intervention, under real shift conditions.

/ 02

Intervention Frequency

How often a human operator must step in — and whether the trend is downward.

/ 03

Recovery Behavior

When something fails, what the system does next. Recovery is the real test of cognition.

/ 04

Operator Trust

Measured by override rate, escalation patterns, and qualitative supervisor signal.

/ 05

Environment Variability

How much of the operating envelope the system actually covers — not just the happy path.

/ 06

Memory Usefulness

Is the system better in its second week than in its first? Does a new unit inherit that knowledge?

/ 07

Fleet Learning

Improvement curve at the fleet level, not the unit level. The thing that compounds across sites.

/ 08

Deployment Cost

Engineering time per new site, per new unit, per new exception. The unit economics of autonomy.
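
The first two metrics can be computed directly from task records. A minimal sketch, assuming a hypothetical `TaskRecord` shape; the field names and sample data are illustrative and not taken from any real console export.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TaskRecord:
    completed: bool
    engineering_intervention: bool  # an engineer had to step in
    operator_interventions: int     # operator step-ins during the task


def completion_reliability(tasks: Sequence[TaskRecord]) -> float:
    """Share of tasks completed without engineering intervention."""
    if not tasks:
        return 0.0
    clean = sum(1 for t in tasks if t.completed and not t.engineering_intervention)
    return clean / len(tasks)


def intervention_rate(tasks: Sequence[TaskRecord], shift_hours: float) -> float:
    """Operator interventions per hour over one shift."""
    return sum(t.operator_interventions for t in tasks) / shift_hours


# Hypothetical shift: four tasks over eight hours.
shift = [
    TaskRecord(True, False, 0),
    TaskRecord(True, False, 1),
    TaskRecord(True, True, 2),
    TaskRecord(False, False, 1),
]
print(completion_reliability(shift))   # 0.5
print(intervention_rate(shift, 8.0))   # 0.5
```

Tracking both per shift, rather than per pilot, is what makes the trend in metric 02 visible.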

09 — Recovery intelligence

Every Override Is Labour. Every Override Is Also Memory You Never Captured.

The runtime treats recovery as a closed loop — detect, contain, escalate, resume, and feed the lesson back into policy. Captured once. Inherited everywhere.

01
Detect
Anomaly recognised against confidence threshold, drift band, or hard constraint — before the operator sees it.
02
Contain
Move to a stated next state. Slow down, secure, hold the work envelope. No silent fail-through.
03
Escalate with context
Route to the right human at the right threshold — with the trace, not the alarm.
04
Resolve
Operator decision or autonomous recovery. The decision is attributed and logged.
05
Resume
Pick up the work envelope where it paused. State is restored, not restarted.
06
Capture
The lesson enters memory. The next unit, next shift, next site inherits it. Recovery becomes infrastructure.
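
The six steps above can be sketched as a small closed loop. This is an illustration under stated assumptions, not the runtime's implementation: `Stage`, `run_recovery`, and the dictionary standing in for fleet memory are hypothetical, and the rule that a known pattern skips escalation is one possible reading of "captured once, inherited everywhere".

```python
from enum import Enum, auto


class Stage(Enum):
    DETECT = auto()
    CONTAIN = auto()
    ESCALATE = auto()
    RESOLVE = auto()
    RESUME = auto()
    CAPTURE = auto()


def run_recovery(pattern: str, operator_decision: str, memory: dict) -> list:
    """Walk one anomaly through the loop and return the stages visited."""
    trail = [Stage.DETECT, Stage.CONTAIN]
    resolution = memory.get(pattern)
    if resolution is None:
        # Unknown pattern: a human decides, and the decision is kept.
        trail.append(Stage.ESCALATE)
        resolution = operator_decision
    trail += [Stage.RESOLVE, Stage.RESUME, Stage.CAPTURE]
    memory[pattern] = resolution  # the lesson enters shared memory
    return trail


fleet_memory = {}
first = run_recovery("bin reshelved", "re-scan adjacent bays", fleet_memory)
second = run_recovery("bin reshelved", "unused", fleet_memory)
print(Stage.ESCALATE in first, Stage.ESCALATE in second)  # True False
```

The second occurrence resolves from memory without paging anyone, which is the behaviour the loop exists to produce.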

A deployment that requires heroics does not scale. A deployment whose lessons are not captured cannot stop requiring heroics. The runtime layer exists to close that loop.


Most Deployments Die Between Site One and Site Two.