AI Agent tmpfs Artifacts Disk Full
Agent tools often write frame dumps, wrapper logs, launch tempfiles, capture files, hooks, and worktree scratch under /tmp. When that path is a RAM-backed tmpfs, a cleanup bug can make unrelated tests fail with ENOSPC even though the normal disk still looks healthy.
Get the safe tmpfs cleanup boundary before sweeping /tmp.
The safe fix is not a broad rm -rf /tmp/agent-*. First classify what is there: stale frame dumps, per-run temp files, current session files, and logs that should move under a retained log root.
df /tmp -> list agent debris -> preserve recent/live files -> sweep stale artifacts
Measure tmpfs pressure, artifact age, and active handles first.
These checks report size, age, and open handles. They do not delete files or stop sessions.
df -h /tmp; find /tmp -maxdepth 2 -name 'aiur-*' ...
Runbook: Sweep Only Stale, Known, Owned Artifacts
- Confirm the failing filesystem is tmpfs, not the host root disk. A RAM-backed tmpfs can be full while / has plenty of free space.
- Measure by name, age, and owner. Agent cleanup should target known artifact shapes, not every file in /tmp.
- Protect current session files: recent launchers, wrapper PIDs, active tmux/socket files, hooks, and any file still held open by a live process.
- Move unbounded debug/frame output under a session log directory that already has retention. Do not tee high-frequency redraw output to tmpfs forever.
- Add a stale-artifact sweep around shutdown, stop, and explicit developer cleanup commands, using a bounded age threshold.
- Emit before/after disk snapshots so future failures can be attributed to tmpfs debris, log retention, or a separate test failure.
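The sweep step in the runbook above could look roughly like the following. This is a minimal sketch, not the tool's actual cleanup: the function name `sweep_stale_artifacts`, the 120-minute default threshold, and the dry-run default are assumptions, and the name patterns are taken from the read-only checks on this page. It assumes GNU find and, where available, `fuser` to detect open handles.

```shell
# Hypothetical helper: sweep_stale_artifacts [ROOT] [MAX_AGE_MIN] [DRY_RUN]
# Removes only regular files that match known artifact names, are owned by
# the current user, and are older than MAX_AGE_MIN minutes.
# DRY_RUN=1 (the default here, an assumption) only prints what would happen.
sweep_stale_artifacts() {
  local root="${1:-${TMPDIR:-/tmp}}"
  local max_age_min="${2:-120}"   # assumed threshold; tune per workflow
  local dry_run="${3:-1}"

  # -xdev stays on the tmpfs mount; -user limits to same-user files;
  # -mmin +N matches files last modified more than N minutes ago.
  find "$root" -maxdepth 2 -xdev -type f -user "$(id -u)" \
    \( -name 'aiur-*' -o -name '*frames.bin' \) \
    -mmin "+$max_age_min" \
    -exec sh -c '
      dry_run=$1; shift
      for path in "$@"; do
        # Skip anything a live process still holds open (if fuser exists).
        if command -v fuser >/dev/null 2>&1 && fuser -s "$path" 2>/dev/null; then
          echo "keep (open handle): $path"
        elif [ "$dry_run" = "1" ]; then
          echo "would remove: $path"
        else
          rm -f -- "$path" && echo "removed: $path"
        fi
      done
    ' sh "$dry_run" {} +
}
```

Run it once with the dry-run default and review the output, then call `sweep_stale_artifacts /tmp 120 0` to actually delete. Sockets, directories, and files with other names are never matched because of `-type f` and the explicit name list.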
Use this when agent debug artifacts fill /tmp.
This keeps the fix review focused on safe cleanup boundaries, not broad deletion.
I would add one acceptance check around the cleanup boundary: prove that stale tmpfs artifacts are removed while current session files are preserved.
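One way to express that acceptance check is a small fixture test that can wrap any cleanup command. The sketch below is illustrative only: `check_cleanup_boundary` is a hypothetical helper, the fixture file names are made up, and it assumes GNU `touch -d` for backdating files. The cleanup command under test receives the fixture directory as its last argument.

```shell
# Hypothetical helper: check_cleanup_boundary CLEANUP_CMD [ARGS...]
# Builds a throwaway fixture with one stale artifact, one current session
# file, and one unrelated stale file, runs the cleanup against it, and
# reports whether the boundary held.
check_cleanup_boundary() {
  local fixture rc=0
  fixture="$(mktemp -d)" || return 2

  touch -d '3 hours ago' "$fixture/aiur-stale-frames.bin"  # should be removed
  touch "$fixture/aiur-current-session.log"                # should survive (recent)
  touch -d '3 hours ago' "$fixture/unrelated-notes.txt"    # should survive (unknown name)

  # Run the cleanup under test; its own output is not part of the verdict.
  "$@" "$fixture" >/dev/null 2>&1

  [ ! -e "$fixture/aiur-stale-frames.bin" ] || { echo "FAIL: stale artifact survived"; rc=1; }
  [ -e "$fixture/aiur-current-session.log" ] || { echo "FAIL: current session file deleted"; rc=1; }
  [ -e "$fixture/unrelated-notes.txt" ] || { echo "FAIL: unrelated file deleted"; rc=1; }

  rm -rf -- "$fixture"
  [ "$rc" -eq 0 ] && echo "PASS: cleanup boundary holds"
  return "$rc"
}
```

For example, a function that wraps `find "$1" -maxdepth 1 -type f -name 'aiur-*' -mmin +120 -delete` should pass, while one that runs `rm -f "$1"/aiur-*` should fail because it deletes the current session file.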
Read-only evidence:
TMP_ROOT=${TMPDIR:-/tmp}
df -h "$TMP_ROOT"
df -i "$TMP_ROOT"
find "$TMP_ROOT" -maxdepth 2 \( -name "aiur-*" -o -name "*agent*" -o -name "*frames.bin" \) -printf "%s %TY-%Tm-%Td %TH:%TM %p\n" 2>/dev/null | sort -n | tail -60
lsof +D "$TMP_ROOT" 2>/dev/null | sed -n "1,80p" || true
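The commands above show usage but not the filesystem type, and their output is hard to compare across runs. A possible one-line snapshot that records both is sketched below; `tmp_snapshot` and its output format are assumptions, and it relies on GNU `stat -f -c %T` plus POSIX-format `df` output (a filesystem name containing spaces would break the field positions).

```shell
# Hypothetical helper: tmp_snapshot LABEL [ROOT]
# Prints one line with filesystem type, block usage, and inode usage so a
# "before" and "after" line can be compared or grepped from logs.
tmp_snapshot() {
  local label="$1" root="${2:-${TMPDIR:-/tmp}}"
  local fstype blocks inodes

  # GNU stat: %T with -f prints the filesystem type name, e.g. "tmpfs".
  fstype="$(stat -f -c %T "$root")" || return 1

  # df -P keeps each filesystem on one line; fields 3 and 4 are used/available.
  blocks="$(df -Pk "$root" | awk 'NR==2 {print "used_kb=" $3 " avail_kb=" $4}')"
  inodes="$(df -Pi "$root" | awk 'NR==2 {print "iused=" $3 " ifree=" $4}')"

  printf '%s snapshot=%s root=%s fstype=%s %s %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$label" "$root" "$fstype" "$blocks" "$inodes"
}
```

Calling `tmp_snapshot before` ahead of a sweep and `tmp_snapshot after` once it finishes gives two comparable lines. If `fstype` is not `tmpfs`, the ENOSPC is probably coming from somewhere other than RAM-backed /tmp.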
Safe policy:
- remove only known agent artifact names
- require same-user ownership
- require stale age threshold
- protect current session tempfiles and live handles
- move unbounded frame dumps under the retained session log root
Turn one tmpfs incident into a reusable agent cleanup policy.
The $99 policy is for agent frameworks, local dev tools, and CI/sandbox systems that create debug dumps, temp captures, wrapper logs, hooks, or worktrees. You get safe/review/do-not-touch boundaries, stale age thresholds, and operator-facing messages for one representative workflow.
Do Not Delete First
- Current session tempfiles, live wrapper PIDs, tmux sockets, hooks, and files still held open by a running process.
- Session log roots that retention already manages.
- Workspaces, active task output, credentials, or local state just because they share an agent prefix.
- Diagnostic evidence proving which artifact class filled tmpfs.
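The safe/review/do-not-touch split can be prototyped as a report-only classifier before any deletion logic exists. The following is a rough sketch under stated assumptions: `classify_artifact`, the `.pid`/`.sock`/`hook` name rules, and the 120-minute default are illustrative choices, not a published policy, and it assumes GNU find and an optional `fuser`.

```shell
# Hypothetical helper: classify_artifact PATH [MAX_AGE_MIN]
# Prints exactly one of: safe, review, do-not-touch. Deletes nothing.
classify_artifact() {
  local path="$1" max_age_min="${2:-120}"   # assumed threshold

  # Sockets and FIFOs belong to live sessions (tmux, wrappers).
  if [ -S "$path" ] || [ -p "$path" ]; then echo "do-not-touch"; return; fi

  # PID files, socket-named files, and hooks are session plumbing.
  case "${path##*/}" in
    *.pid|*.sock|*hook*) echo "do-not-touch"; return ;;
  esac

  # Files owned by another user (or missing) are out of scope.
  [ -O "$path" ] || { echo "do-not-touch"; return; }

  # Recent files may belong to the current session.
  if [ -z "$(find "$path" -maxdepth 0 -mmin "+$max_age_min" 2>/dev/null)" ]; then
    echo "do-not-touch"; return
  fi

  # Anything still held open by a process stays, when fuser can tell us.
  if command -v fuser >/dev/null 2>&1 && fuser -s "$path" 2>/dev/null; then
    echo "do-not-touch"; return
  fi

  # Stale, owned, closed: only known artifact shapes are safe to sweep.
  case "${path##*/}" in
    aiur-*|*frames.bin)
      if [ -f "$path" ]; then echo "safe"; else echo "review"; fi ;;
    *) echo "review" ;;
  esac
}
```

Directories such as workspaces or worktrees that match the prefix land in `review` rather than `safe`, so sharing an agent prefix never makes something deletable by default.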