SafeDisk AI

Linux /var/lib Not Writable Service Crash Loop

When a service must create a state file under /var/lib but the write is blocked by systemd sandboxing, wrong ownership, a read-only mount, or a full disk (ENOSPC), the fix has two parts: a guarded error path in the code and a packaging-level check that the state directory is writable.

Read-Only Evidence

Do not delete state directories first. Prove which layer prevents writes.

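# Unit name and state directory to inspect (adjust for the affected service).
svc=netdata
state_dir=/var/lib/netdata

# Unit file plus the sandboxing and identity settings that decide write access.
systemctl cat "$svc"
systemctl show "$svc" -p User -p Group -p StateDirectory -p ReadWritePaths -p ReadWriteDirectories -p ProtectSystem -p ProtectHome
# Owner and mode of every path component, then of the directory itself.
namei -om "$state_dir"
stat -c '%U:%G %a %n' "$state_dir" 2>/dev/null || true
# Free blocks and free inodes; exhausting either one produces ENOSPC.
df -h "$state_dir" /
df -i "$state_dir" /
# Mount options (look for ro) on the filesystems that can hold the state directory.
mount | grep -E ' /var | /var/lib | / '
# Recent log lines that name the failure.
journalctl -u "$svc" -n 120 --no-pager | grep -Ei 'read-only|permission denied|no space|enospc|erofs|segv|null|state|var/lib'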
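
To turn the matching journal lines into a one-word verdict per line, a small classifier can help. This is a minimal sketch: the function name classify_write_error is hypothetical, and it assumes the log lines carry the usual English error strings or the errno names themselves.

```shell
# Hypothetical helper: map one log line to the errno class it most likely reflects.
# Assumes English error text (C locale) or a literal errno name in the line.
classify_write_error() {
  # Lowercase the input so matching is case-insensitive.
  line=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$line" in
    *"read-only file system"*|*erofs*)      echo EROFS ;;
    *"permission denied"*|*eacces*)         echo EACCES ;;
    *"no space left"*|*enospc*)             echo ENOSPC ;;
    *"no such file or directory"*|*enoent*) echo ENOENT ;;
    *)                                      echo UNKNOWN ;;
  esac
}

# Example use (not run here): count verdicts over recent journal lines.
#   journalctl -u "$svc" -n 120 --no-pager |
#     while IFS= read -r l; do classify_write_error "$l"; done | sort | uniq -c
```

A log that mixes several verdicts usually means more than one layer is involved, so each one should be checked against the evidence above before anything is changed.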

Safe Fix Boundary

  1. Add a null/error guard in code wherever state-file creation can fail. The API should return a controlled error or degraded response, not crash.
  2. Treat EROFS, EACCES, ENOSPC, and missing parent directories as separate test cases.
  3. For packages, use systemd state management deliberately: StateDirectory=, ReadWritePaths=, or the narrowest writable path required by the service.
  4. Add a startup or health warning that says the configured state directory is not writable before the first user-facing request hits the crash path.
  5. Keep cleanup separate from packaging. Freeing disk can fix ENOSPC, but it does not fix a read-only mount or missing writable path in the unit file.
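
Items 1, 2, and 4 can be sketched as a startup check that names the reason the state directory is unusable, so the service can log a warning and degrade instead of crashing. The function name check_state_dir and the probe file name are hypothetical, and the error matching assumes English error text from the shell.

```shell
# Hypothetical startup check: print one status word for the state directory.
# Returns 0 only when a one-byte probe file could be written.
check_state_dir() {
  dir=$1
  # Missing parent is its own case; report it before attempting a write.
  if [ ! -d "$dir" ]; then
    echo "missing-parent"
    return 1
  fi
  probe="$dir/.write-probe.$$"   # harmless temp name, removed afterwards
  # Write one byte and capture the shell's error text if that fails.
  if err=$( { printf x > "$probe"; } 2>&1 ); then
    rm -f "$probe"
    echo "writable"
    return 0
  fi
  rm -f "$probe" 2>/dev/null || true
  # Map the error text to the cases that deserve separate tests.
  case "$err" in
    *"Read-only file system"*)   echo "EROFS" ;;
    *"Permission denied"*)       echo "EACCES" ;;
    *"No space left on device"*) echo "ENOSPC" ;;
    *)                           echo "unknown" ;;
  esac
  return 1
}
```

A service wrapper could call check_state_dir "$state_dir" at startup and log the returned word. The probe writes a byte rather than creating an empty file because an empty file can still be created on a filesystem with no free blocks.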

Copy-ready issue reply

Use this when a service crashes because its state directory is not writable.

The goal is to keep review focused on deterministic failure handling and packaging validation.

I would add tests at two layers:

1. Code path: state/session file creation returns EROFS, EACCES, ENOSPC, and missing-parent errors. The API should not crash, and it should not pass NULL into formatting/path helpers.
2. Package path: the shipped systemd sandbox can create the exact state file the agent expects under the configured varlib/state directory.

Read-only operator evidence:
- systemctl cat/show for ProtectSystem, StateDirectory, ReadWritePaths/Directories
- namei/stat for the state directory ownership and mode
- df -h and df -i for the state directory
- one namespace write probe only if needed, using a harmless temp name

The fix boundary should be: make the directory writable for the service, report a controlled degraded state when it is not writable, and never delete live state as the first recovery step.
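
One way to make the directory writable for the service is a systemd drop-in that declares StateDirectory=. The sketch below writes such a drop-in into a directory passed as an argument so it can be reviewed before it is installed. The function name write_state_dropin and the file name state-dir.conf are hypothetical, and whether StateDirectory= suits a given package depends on how its unit is already sandboxed.

```shell
# Hypothetical helper: write a systemd drop-in that declares the state directory.
# $1 = drop-in directory (for example /etc/systemd/system/netdata.service.d)
# $2 = directory name relative to /var/lib (for example netdata)
write_state_dropin() {
  dropin_dir=$1
  name=$2
  mkdir -p "$dropin_dir" || return 1
  # StateDirectory= asks systemd to create /var/lib/<name> for the service user
  # and keep it writable even when ProtectSystem= restricts the rest of the tree.
  cat > "$dropin_dir/state-dir.conf" <<EOF
[Service]
StateDirectory=$name
EOF
}

# Example use (not run here), followed by a reload and restart:
#   write_state_dropin /etc/systemd/system/netdata.service.d netdata
#   systemctl daemon-reload && systemctl restart netdata
```

Writing to a staging directory first keeps the packaging change separate from any cleanup, in line with the boundary above.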
