Nix GC EBUSY Runner Disk Full Recovery

When a self-hosted Nix CI runner leaks stale patched bind mounts, nix-collect-garbage can abort on the first EBUSY store path and free nothing. Treat that as a runner policy incident: prove which mounts are stale, preserve active build roots, then make GC resilient before the root filesystem fills again.

Runbook: Break The GC Abort Loop

Stop treating the full disk as a generic cache issue. If GC aborts on EBUSY, every normal reclaim path may free zero bytes until the stale mount boundary is resolved.
Capture mount evidence from /proc/1/mountinfo, not just mount output. Preserve the source and target paths for every .patched.* mount.
Check GC roots and active builder processes before unmounting. A lazy unmount is acceptable only once the mount is proven stale or the build that owns it has been stopped.
Run GC again after unmounting stale patched mounts and record bytes/paths freed. The before/after proves whether EBUSY was the blocking condition.
Add a startup guard for persistent runners: if patched mounts older than a threshold exist, alert and quarantine the runner before new builds consume the remaining root filesystem.
Add acceptance tests for cancellation: cancel a fixed-output derivation (FOD) patching build, restart the daemon, rerun GC, and verify that GC skips or reaps the stale state instead of making zero progress.
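
The mount-evidence and startup-guard steps above can be sketched in shell. This is a minimal sketch, not a definitive implementation: the function names patched_mounts and startup_guard are hypothetical, and it assumes the leaked mounts carry ".patched." in their source or target path. Because mountinfo carries no timestamps, the guard skips the age threshold and simply treats any patched mount present before the runner accepts work as left over.

```shell
# Sketch only: function names are illustrative, not part of Nix or any runner.

# Print "source<TAB>target" for every mount whose source or target contains
# ".patched.". In mountinfo, field 4 is the root of the mount relative to its
# filesystem (the bind source) and field 5 is the mount point (the target).
# Paths containing spaces appear escaped (\040) and are printed as-is.
patched_mounts() {
  mountinfo="${1:-/proc/1/mountinfo}"
  awk '$4 ~ /\.patched\./ || $5 ~ /\.patched\./ { print $4 "\t" $5 }' "$mountinfo"
}

# Startup guard: call before the runner registers for work. No build should
# be in flight at that point, so any patched mount is treated as left over.
# Returns 0 when clean, 1 when the runner should be quarantined.
startup_guard() {
  found="$(patched_mounts "${1:-/proc/1/mountinfo}")"
  if [ -n "$found" ]; then
    printf 'QUARANTINE: patched mounts present at startup:\n%s\n' "$found" >&2
    return 1
  fi
  return 0
}
```

A runner start script could call startup_guard and refuse to register when it returns non-zero. If /nix is a separate filesystem, the source column is relative to that filesystem's root, so keep the whole mountinfo line alongside this summary as evidence.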

Copy-ready issue reply

Use this when GC aborts with EBUSY and disk keeps filling.

This keeps the discussion focused on the operational invariant: stale patched mounts should not make every later GC reclaim zero bytes.

I would treat this as a GC progress failure, not just a disk cleanup problem. If one stale patched bind mount makes nix-collect-garbage abort with EBUSY, root can fill even though most store paths are otherwise reclaimable.

The useful recovery packet is:

df -hT / /nix /tmp
df -i / /nix /tmp
grep -E 'patched\.' /proc/1/mountinfo
nix-store --gc --print-roots | head -n 200
journalctl -u nix-daemon --since "24 hours ago" | grep -E "EBUSY|patched|garbage|ENOSPC|No space" | tail -n 200

Then lazy-unmount only patched mounts proven stale or tied to a canceled build, rerun GC, and record bytes/paths freed. For recurrence, I would add a daemon/runner-start guard that alerts on old .patched.* mounts before the runner accepts more work.
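
For operators applying that last step, the unmount-and-measure sequence could look roughly like the sketch below. The helper names (avail_bytes, unmount_stale, bytes_freed) and the DRY_RUN variable are hypothetical. The sketch assumes GNU df and util-linux umount, and it does not decide staleness itself: the targets passed in must already be confirmed stale by a human or by the checks above.

```shell
# Sketch only: helper names and DRY_RUN are illustrative assumptions.

# Available bytes on the filesystem that holds the given path (GNU df).
avail_bytes() {
  df --output=avail -B1 "$1" | tail -n 1 | tr -d ' '
}

# Lazy-unmount each operator-confirmed stale target.
# DRY_RUN defaults to 1, so nothing is unmounted unless DRY_RUN=0 is set.
unmount_stale() {
  for target in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: umount -l $target"
    else
      umount -l "$target"
    fi
  done
}

# Bytes reclaimed between two avail_bytes readings: bytes_freed BEFORE AFTER
bytes_freed() {
  echo $(( $2 - $1 ))
}

# Intended order of use (commented out so that sourcing has no side effects):
#   before="$(avail_bytes /nix)"
#   DRY_RUN=0 unmount_stale "$@"     # targets confirmed stale by the operator
#   nix-collect-garbage
#   after="$(avail_bytes /nix)"
#   echo "freed $(bytes_freed "$before" "$after") bytes"
```

Keeping dry-run as the default means a pasted command prints what it would unmount before anything changes, and the before/after difference gives the issue thread a concrete number.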

Do Not Delete First

Active build outputs, live GC roots, and current builder work directories.
Nix store paths, until you have confirmed that EBUSY comes from a stale mount rather than from a live build that still owns the path.
Runner diagnostic logs that prove the original ENOSPC, EBUSY, cancellation, or daemon restart sequence.
Docker volumes, cache stores, or workspace state on persistent runners until ownership and rebuildability are clear.
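
One way to apply the protect-first rule to store paths is to check them against a saved copy of the nix-store --gc --print-roots output. The sketch below is a conservative check under stated assumptions: the function names are hypothetical, the listing is assumed to have lines shaped like "ROOT -> STORE_PATH", and root paths containing spaces are not handled. It finds only roots that point directly at a path, so a "review" verdict is not permission to delete: the path may still be kept alive through another root's closure, which nix-store --query --roots PATH can answer on a live system.

```shell
# Sketch only: function names are illustrative assumptions.

# Print the GC roots that point directly at a store path, given a file that
# holds saved `nix-store --gc --print-roots` output ("ROOT -> STORE_PATH").
direct_roots_for() {
  awk -v p="$1" '$2 == "->" && $3 == p { print $1 }' "$2"
}

# Verdict for one store path: "protect" when a direct root exists,
# otherwise "review" (needs a closure-aware check before any deletion).
protect_or_review() {
  if [ -n "$(direct_roots_for "$1" "$2")" ]; then
    echo protect
  else
    echo review
  fi
}
```

Working from a saved listing also keeps the verdict reproducible after the runner state changes, which fits the advice to preserve diagnostic evidence.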
