Waves & canary
A rollout does not hit every device at once. Its strategy divides the target group into ordered waves, each a cumulative percentage of the group. The first wave is your canary; later waves widen the change only after the ones before them have proven healthy. If something goes wrong early, the blast radius is a handful of devices, not the fleet.
Cumulative waves
A strategy is a list of waves, each carrying a cumulative percentage. The percentages must strictly increase and the final wave must reach 100%. A classic canary plan looks like this:
{
  "waves": [
    { "percent": 1, "approval": "auto" },
    { "percent": 10, "approval": "auto" },
    { "percent": 50, "approval": "manual" },
    { "percent": 100, "approval": "auto" }
  ]
}
Wave 0 covers the first 1% of the group. Wave 1 brings the running total to 10%, so it adds the next 9% of devices, and so on: each wave targets only the devices between the previous wave's cut and its own. If you don't supply a strategy, the rollout runs as a single wave to 100%.
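The two rules above (strictly increasing percentages, final wave at 100%) are easy to check before you submit a plan. The following is a minimal sketch, not the platform's own validator: the function name `validate_strategy` is hypothetical, and the `auto` approval on the default single wave is an assumption, since the behaviour described here only says that the default is one wave to 100%.

```python
from typing import Optional

# Assumption: the implicit default wave behaves like an "auto" wave.
DEFAULT_STRATEGY = {"waves": [{"percent": 100, "approval": "auto"}]}


def validate_strategy(strategy: Optional[dict]) -> list:
    """Return the list of waves, or raise ValueError if the plan is malformed."""
    if strategy is None:
        # No strategy supplied: a single wave to 100%.
        strategy = DEFAULT_STRATEGY

    waves = strategy.get("waves", [])
    if not waves:
        raise ValueError("a strategy needs at least one wave")

    previous = 0
    for index, wave in enumerate(waves):
        percent = wave.get("percent")
        if wave.get("approval") not in ("auto", "manual"):
            raise ValueError(f"wave {index}: approval must be 'auto' or 'manual'")
        # Percentages are cumulative, so each must exceed the one before it.
        if not isinstance(percent, (int, float)) or not previous < percent <= 100:
            raise ValueError(
                f"wave {index}: percent must be above {previous} and at most 100"
            )
        previous = percent

    if previous != 100:
        raise ValueError("the final wave must reach 100%")
    return waves
```

Calling `validate_strategy(None)` returns the single default wave, while a plan such as 10, 10, 100 or one that stops at 50 raises `ValueError`.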
Deterministic device assignment
Devices are partitioned into waves in a stable order, so the same group always produces the same canary cohort. The split is computed once when the rollout is created and stored with it. Wave 0 receives its assignments on each device's next heartbeat; later waves receive theirs as the gates ahead of them clear.
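To make the idea of a stable split concrete, here is one way such a partition could work. This is an illustrative sketch only: ordering devices by a SHA-256 hash of their ID and rounding each cut down are both assumptions, and `stable_order` and `partition` are hypothetical names. The platform's actual ordering and rounding rules are not specified here.

```python
import hashlib


def stable_order(device_ids):
    """Order devices by a hash of their ID (assumed ordering, for illustration).

    The result depends only on the set of IDs, not on the order they arrive in.
    """
    return sorted(
        device_ids,
        key=lambda device_id: hashlib.sha256(device_id.encode("utf-8")).hexdigest(),
    )


def partition(device_ids, percents):
    """Split devices into waves from a list of cumulative percentages."""
    ordered = stable_order(device_ids)
    waves = []
    start = 0
    for percent in percents:
        # Assumption: cuts round down, so a small group can yield an empty wave.
        cut = len(ordered) * percent // 100
        waves.append(ordered[start:cut])  # only the devices past the previous cut
        start = cut
    return waves


if __name__ == "__main__":
    group = [f"device-{i:03d}" for i in range(200)]
    for index, wave in enumerate(partition(group, [1, 10, 50, 100])):
        print(f"wave {index}: {len(wave)} devices")
```

With 200 devices and the plan 1, 10, 50, 100, the waves hold 2, 18, 80 and 100 devices. With only 20 devices the first wave is empty under this rounding assumption, which is the kind of trivially complete wave mentioned under Progression and gates.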
The agent pulls its assignment over the existing device channel — there is no separate connection to open and nothing for you to integrate. The platform serves the assignment; the device verifies, applies, probes and reports.
Progression and gates
A wave is complete when every live device in it has reached a terminal state. What happens next depends on the next wave's approval mode:
auto: the rollout advances to the next wave on its own, provided the completed waves are within the failure tolerance.
manual: the rollout parks in a waiting for approval state. It advances only when an operator with approval rights approves it. This is your gate before a wide blast radius, for example, before going from 50% to the whole fleet.
Approving a parked rollout releases the next wave immediately; if that wave turns out to be trivially complete (for instance, rounding produced no new devices), the rollout keeps advancing without further prompting.
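The progression rules can be read as a small state machine. The sketch below is a simplified model for reasoning about them, not the platform's implementation. The state names (`healthy`, `failed`, `pending`), the status strings, the `max_failure_rate` parameter and the `halted` outcome are all hypothetical. It also assumes that one approval releases exactly one manual gate and that the tolerance is a failed-device fraction across completed waves.

```python
from dataclasses import dataclass, field

# Hypothetical terminal device states; anything else counts as still in flight.
TERMINAL = {"healthy", "failed"}


@dataclass
class Wave:
    approval: str  # "auto" or "manual"
    # State per live device in this wave (device ID -> state).
    device_states: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        # An empty wave is trivially complete: all() over nothing is True.
        return all(state in TERMINAL for state in self.device_states.values())


def failure_rate(waves) -> float:
    """Fraction of failed devices across the given waves."""
    states = [s for wave in waves for s in wave.device_states.values()]
    if not states:
        return 0.0
    return sum(s == "failed" for s in states) / len(states)


def advance(waves, current, max_failure_rate, approved=False):
    """Return (wave_index, status) after advancing as far as the rules allow."""
    while True:
        if not waves[current].is_complete():
            return current, "in_progress"
        if current == len(waves) - 1:
            return current, "completed"

        next_wave = waves[current + 1]
        if next_wave.approval == "manual":
            if not approved:
                return current, "waiting_for_approval"
            approved = False  # assumption: one approval opens one gate
        elif failure_rate(waves[: current + 1]) > max_failure_rate:
            # Assumption: an auto gate outside tolerance stops the rollout.
            return current, "halted"

        current += 1  # release the next wave, then re-check it immediately
```

Because the loop re-checks the newly released wave straight away, an approved wave that contains no devices is skipped in the same call, matching the "keeps advancing without further prompting" behaviour described above.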
Throughout, you watch waves land in real time: per-device state, the current wave, and how many devices are healthy so far.
Direction: model A/B experiments
Wave-based canarying answers "is this update safe to widen?" A separate capability on the roadmap answers "is this model variant better?" — assigning model variants across a cohort, collecting the metrics your application already emits, and comparing variants before you promote one. It builds on the same rollout machinery described here.