Waves & canary

A rollout does not hit every device at once. Its strategy divides the target group into ordered waves, each a cumulative percentage of the group. The first wave is your canary; later waves widen the change only after the ones before them have proven healthy. If something goes wrong early, the blast radius is a handful of devices, not the fleet.

Cumulative waves

A strategy is a list of waves, each carrying a cumulative percentage. The percentages must strictly increase and the final wave must reach 100%. A classic canary plan looks like this:

{
  "waves": [
    { "percent": 1,   "approval": "auto" },
    { "percent": 10,  "approval": "auto" },
    { "percent": 50,  "approval": "manual" },
    { "percent": 100, "approval": "auto" }
  ]
}

Wave 0 covers the first 1% of the group; wave 1 brings the running total to 10%, and so on. A wave targets only the devices between the previous wave's cut and its own. If you don't supply a strategy, the rollout runs as a single wave to 100%.
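The rules above can be sketched in a few lines of Python. This is only an illustration, not the platform's implementation: the function names `validate_waves` and `wave_slices` are hypothetical, and rounding each cumulative cut down is an assumption, since the rounding rule is not documented here.

```python
def validate_waves(waves):
    """Enforce the two rules above: strictly increasing percentages, ending at 100."""
    if not waves:
        raise ValueError("a strategy needs at least one wave")
    previous = 0
    for index, wave in enumerate(waves):
        percent = wave["percent"]
        if percent <= previous:
            raise ValueError(f"wave {index}: percent {percent} must exceed {previous}")
        previous = percent
    if previous != 100:
        raise ValueError(f"final wave reaches {previous}%, not 100%")


def wave_slices(waves, group_size):
    """Return one (start, end) index pair per wave into the ordered group.

    Assumption: each cumulative cut is rounded down.
    """
    validate_waves(waves)
    slices = []
    start = 0
    for wave in waves:
        end = group_size * wave["percent"] // 100  # cumulative cut for this wave
        slices.append((start, end))                # only the devices past the previous cut
        start = end
    return slices


plan = [
    {"percent": 1, "approval": "auto"},
    {"percent": 10, "approval": "auto"},
    {"percent": 50, "approval": "manual"},
    {"percent": 100, "approval": "auto"},
]
print(wave_slices(plan, 1000))  # [(0, 10), (10, 100), (100, 500), (500, 1000)]
print(wave_slices(plan, 50))    # [(0, 0), (0, 5), (5, 25), (25, 50)]
```

With 50 devices and round-down cuts, the 1% wave is empty. Under this assumption, a small group can produce a wave with no devices at all.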

Deterministic device assignment

Devices are partitioned into waves in a stable order, so the same group always produces the same canary cohort. The split is computed once when the rollout is created and stored with it. Wave 0 receives its assignments on each device's next heartbeat; later waves receive theirs as the gates ahead of them clear.
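One way to get a stable order is to sort devices by a hash of their ID and then cut the sorted list at each cumulative percentage. The sketch below assumes exactly that. The source only promises "a stable order", so the SHA-256 key, the round-down cuts, and the names `stable_order` and `assign_waves` are all hypothetical.

```python
import hashlib


def stable_order(device_ids):
    """Order devices by a hash of their ID (assumed ordering, not the documented one)."""
    def key(device_id):
        digest = hashlib.sha256(device_id.encode("utf-8")).hexdigest()
        return (digest, device_id)  # the ID breaks any hash tie deterministically
    return sorted(set(device_ids), key=key)


def assign_waves(device_ids, percents):
    """Map each device ID to a wave index, given cumulative percentages."""
    ordered = stable_order(device_ids)
    assignment = {}
    start = 0
    for wave_index, percent in enumerate(percents):
        end = len(ordered) * percent // 100  # assumption: cuts round down
        for device_id in ordered[start:end]:
            assignment[device_id] = wave_index
        start = end
    return assignment


devices = [f"dev-{i}" for i in range(200)]
first = assign_waves(devices, [1, 10, 50, 100])
second = assign_waves(list(reversed(devices)), [1, 10, 50, 100])
print(first == second)  # True: input order does not change the cohort
```

Because the order depends only on the device IDs, the same group yields the same canary cohort however the IDs are listed. Storing the result with the rollout, as described above, keeps the split fixed even if the group changes later.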

The device agent pulls its assignment over the existing device channel, so there is no separate connection to open and nothing for you to integrate. The platform serves the assignment; the device verifies, applies, probes, and reports.

Progression and gates

A wave is complete when every live device in it has reached a terminal state. What happens next depends on the next wave's approval mode:

  • auto: the rollout advances to the next wave on its own, provided the completed waves are within the failure tolerance.
  • manual: the rollout parks in a waiting-for-approval state and advances only when an operator with approval rights approves it. This is your gate before a wider blast radius, for example before widening from 10% to 50% of the group in the plan above.

Approving a parked rollout releases the next wave immediately; if that wave turns out to be trivially complete (for instance, rounding produced no new devices), the rollout keeps advancing without further prompting.
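The progression rules can be modeled as a small state machine. The sketch below is an illustration under stated assumptions, not the platform's code: the state names (`succeeded`, `failed`, `waiting_for_approval`, `halted`), the `max_failure_ratio` field, and the choice to halt when an auto gate finds the tolerance exceeded are all hypothetical. It also assumes an operator's approval is sufficient on its own to release a manual gate.

```python
from dataclasses import dataclass, field

TERMINAL = {"succeeded", "failed"}  # hypothetical terminal device states


@dataclass
class Wave:
    approval: str                                      # "auto" or "manual"
    device_states: dict = field(default_factory=dict)  # live devices only

    def is_complete(self):
        # An empty wave is trivially complete: all() of nothing is True.
        return all(state in TERMINAL for state in self.device_states.values())


@dataclass
class Rollout:
    waves: list
    current: int = 0
    status: str = "running"
    max_failure_ratio: float = 0.1  # hypothetical failure tolerance

    def within_tolerance(self):
        states = [s for w in self.waves[: self.current + 1]
                  for s in w.device_states.values()]
        failed = sum(1 for s in states if s == "failed")
        return not states or failed / len(states) <= self.max_failure_ratio

    def step(self, approve=False):
        """Advance as far as the gates allow; one approval releases one manual gate."""
        while self.status not in ("done", "halted"):
            if not self.waves[self.current].is_complete():
                self.status = "running"
                break
            if self.current == len(self.waves) - 1:
                self.status = "done"
                break
            next_wave = self.waves[self.current + 1]
            if next_wave.approval == "manual":
                if not approve:
                    self.status = "waiting_for_approval"
                    break
                approve = False  # consume the approval
            elif not self.within_tolerance():
                self.status = "halted"  # assumption: auto gates stop here
                break
            self.current += 1
            self.status = "running"
        return self.status


rollout = Rollout(waves=[
    Wave("auto", {"dev-a": "succeeded"}),
    Wave("manual", {}),                   # rounding left this wave empty
    Wave("auto", {"dev-b": "pending"}),
])
print(rollout.step())              # waiting_for_approval
print(rollout.step(approve=True))  # running
print(rollout.current)             # 2
```

In the usage above, a single approval releases the empty manual wave, which completes trivially, and the rollout moves straight on to the following auto wave without another prompt.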

Throughout the rollout, you can watch waves land in real time: per-device state, the current wave, and the number of devices healthy so far.

Direction: model A/B experiments

Wave-based canarying answers "is this update safe to widen?" A separate capability on the roadmap answers "is this model variant better?" — assigning model variants across a cohort, collecting the metrics your application already emits, and comparing variants before you promote one. It builds on the same rollout machinery described here.