SafeDisk AI

File Lock Heartbeat Disk Full Stale Lock

When a file-lock heartbeat silently fails on ENOSPC, inode exhaustion, permissions, or a missing heartbeat path, the lock holder may still be inside the critical section while another process decides the lock is stale and enters too.
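
To tell these causes apart, one option is to snapshot the filesystem behind the lock directory when a heartbeat fails. The sketch below is a minimal illustration, assuming a POSIX system where os.statvfs is available. The names FsStatus, lock_dir_status, and likely_cause are hypothetical and do not come from any particular lock library.

```python
import os
from dataclasses import dataclass


@dataclass
class FsStatus:
    free_bytes: int     # bytes available to unprivileged processes
    total_inodes: int   # 0 on filesystems that do not report inode counts
    free_inodes: int    # inodes available to unprivileged processes
    writable: bool      # whether this process may create entries in the directory


def lock_dir_status(lock_dir: str) -> FsStatus:
    """Snapshot the filesystem state behind a lock directory (POSIX only)."""
    st = os.statvfs(lock_dir)
    return FsStatus(
        free_bytes=st.f_bavail * st.f_frsize,
        total_inodes=st.f_files,
        free_inodes=st.f_favail,
        writable=os.access(lock_dir, os.W_OK | os.X_OK),
    )


def likely_cause(status: FsStatus) -> str:
    """Best-effort guess at why a heartbeat write failed; a hint, not proof."""
    if not status.writable:
        return "permissions"
    if status.free_bytes == 0:
        return "disk_full"
    # Only trust the inode count when the filesystem actually reports one.
    if status.total_inodes > 0 and status.free_inodes == 0:
        return "inode_exhaustion"
    return "unknown"
```

Logging likely_cause(lock_dir_status(lock_dir)) next to the raw errno gives an operator a hint, but the snapshot is taken after the failure and may not match the state at the moment the heartbeat write failed.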

Free browser cleanup

Find the biggest storage culprit first.

Run the Chrome or Edge web scan, delete one approved low-risk item free, then use the $29 Deep Cleanup only if meaningful space remains.

First Response Runbook

A heartbeat failure should not be treated as a successful lock refresh. It should create an explicit lock-health state that downstream stale-lock logic can reason about.
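
One way to make that state explicit is to have the refresh call return a health value instead of swallowing the exception. The following is a minimal sketch. LockHealth, HeartbeatResult, classify, and refresh_heartbeat are hypothetical names, and os.utime stands in for whatever call the real heartbeat uses.

```python
import errno
import os
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class LockHealth(Enum):
    HEALTHY = "healthy"
    STORAGE_FAILURE = "storage_failure"        # ENOSPC, EDQUOT, EIO
    PERMISSION_FAILURE = "permission_failure"  # EACCES, EPERM
    HEARTBEAT_MISSING = "heartbeat_missing"    # ENOENT
    UNKNOWN_FAILURE = "unknown_failure"


# EDQUOT is looked up defensively in case a platform does not define it.
_STORAGE = {errno.ENOSPC, errno.EIO} | (
    {errno.EDQUOT} if hasattr(errno, "EDQUOT") else set()
)
_PERMISSION = {errno.EACCES, errno.EPERM}


def classify(exc: OSError) -> LockHealth:
    """Map a filesystem error to a lock-health state."""
    if exc.errno in _STORAGE:
        return LockHealth.STORAGE_FAILURE
    if exc.errno in _PERMISSION:
        return LockHealth.PERMISSION_FAILURE
    if exc.errno == errno.ENOENT:
        return LockHealth.HEARTBEAT_MISSING
    return LockHealth.UNKNOWN_FAILURE


@dataclass(frozen=True)
class HeartbeatResult:
    health: LockHealth
    error: Optional[OSError] = None


def refresh_heartbeat(
    heartbeat_path: str, touch: Callable[..., None] = os.utime
) -> HeartbeatResult:
    """Refresh the heartbeat and report the outcome instead of hiding it."""
    try:
        touch(heartbeat_path, None)  # None sets atime and mtime to the current time
    except OSError as exc:
        return HeartbeatResult(classify(exc), exc)
    return HeartbeatResult(LockHealth.HEALTHY)
```

The touch parameter exists only so that a test can inject a failing call. The holder is expected to check result.health after every refresh and act on anything other than HEALTHY.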

  1. Log heartbeat refresh failures with the lock path, heartbeat path, operation, and filesystem error.
  2. Classify ENOSPC, EDQUOT, EIO, EACCES, EPERM, and missing heartbeat paths separately from a normal stale timeout.
  3. When heartbeat refresh fails, decide whether the holder aborts protected work, releases the lock, or marks the lock as unhealthy.
  4. Do not let a contender steal the lock only because mtime is stale when a disk or permission failure could explain the stale heartbeat.
  5. Require dead-owner evidence, an explicit fencing token, or a recovery lock before allowing a steal.
  6. Add a two-contender regression test: holder heartbeat fails, contender polls, and the two processes are never inside the critical section at the same time.
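
Steps 1, 4, and 5 above can be sketched as a logging helper plus a steal decision that never relies on mtime alone. This is an illustration under assumptions: LockObservation, its fields, and may_steal are hypothetical, and how a contender learns that the owner is dead or that fencing is enforced is left to the surrounding system.

```python
import errno
import logging
import time
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("lock")  # hypothetical logger name


def log_heartbeat_failure(
    lock_path: str, heartbeat_path: str, operation: str, exc: OSError
) -> None:
    """Step 1: record the lock path, heartbeat path, operation, and error."""
    code = errno.errorcode.get(exc.errno, "UNKNOWN")
    log.warning(
        "heartbeat refresh failed: lock=%s heartbeat=%s op=%s error=%s (%s)",
        lock_path, heartbeat_path, operation, code, exc.strerror,
    )


@dataclass
class LockObservation:
    heartbeat_mtime: float       # seconds since the epoch, read from the heartbeat file
    owner_alive: Optional[bool]  # None when liveness could not be determined
    holds_recovery_lock: bool    # contender holds a separate recovery lock
    fencing_enforced: bool       # protected resource rejects stale fencing tokens


def may_steal(
    obs: LockObservation, stale_after: float, now: Optional[float] = None
) -> bool:
    """Steps 4 and 5: a stale mtime is necessary but never sufficient."""
    now = time.time() if now is None else now
    if now - obs.heartbeat_mtime < stale_after:
        return False  # heartbeat is fresh, the holder is making progress
    if obs.owner_alive is False:
        return True   # dead-owner evidence
    # Owner is alive or unknown: the stale heartbeat may be a disk or
    # permission failure, so require a stronger guarantee before stealing.
    return obs.holds_recovery_lock or obs.fencing_enforced
```

With this shape, an unknown liveness result (None) is treated like a live owner, which errs toward waiting rather than toward concurrent entry.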

Copy-ready issue reply

Use this checklist when a heartbeat error is currently swallowed.

It keeps the fix focused on preventing concurrent entry, not only printing a warning.

I would make the heartbeat failure visible, and I would also define what happens to the protected critical section once the heartbeat path becomes unhealthy.

Acceptance checks I would add:

- Inject a utimes(heartbeatPath) failure with ENOSPC, EIO, or EACCES and assert that the logged warning includes the lock path and operation.
- Treat heartbeat-write failure as lock-health degradation, not a silent success.
- The holder should either abort protected work or mark the lock as non-stealable until ownership is resolved.
- The stale-lock detector should require both a stale mtime and evidence that the owning process is dead before stealing.
- Add a two-contender regression test: holder heartbeat fails, second process polls, and concurrent entry never happens.
- Surface lock-dir filesystem and inode status so disk-full and permission failures are distinguishable.
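
The two-contender check in the list above could be prototyped in a single process before it is turned into a real multi-process test. The sketch below is a simplified model, not an existing test suite: Holder, Contender, failing_touch, and run_scenario are hypothetical names, os.utime stands in for utimes(heartbeatPath), and the holder's policy on failure is assumed to be "abort protected work".

```python
import errno
import os
import tempfile
import time
from typing import Callable


def failing_touch(code: int) -> Callable[..., None]:
    """Build a stand-in for os.utime that always fails with the given errno."""
    def _touch(path, times=None):
        raise OSError(code, os.strerror(code), path)
    return _touch


class Holder:
    def __init__(self, heartbeat_path: str, touch: Callable[..., None] = os.utime):
        self.heartbeat_path = heartbeat_path
        self.touch = touch
        self.in_critical_section = True
        self.errors = []

    def heartbeat(self) -> None:
        try:
            self.touch(self.heartbeat_path, None)
        except OSError as exc:
            self.errors.append(exc)           # failure is recorded, not swallowed
            self.in_critical_section = False  # assumed policy: abort protected work


class Contender:
    def __init__(self, heartbeat_path: str, stale_after: float,
                 owner_is_dead: Callable[[], bool], require_dead_owner: bool):
        self.heartbeat_path = heartbeat_path
        self.stale_after = stale_after
        self.owner_is_dead = owner_is_dead
        self.require_dead_owner = require_dead_owner
        self.in_critical_section = False

    def poll(self, now: float) -> None:
        age = now - os.stat(self.heartbeat_path).st_mtime
        stale = age >= self.stale_after
        if stale and (not self.require_dead_owner or self.owner_is_dead()):
            self.in_critical_section = True  # steal the lock


def run_scenario(require_dead_owner: bool, code: int = errno.ENOSPC,
                 rounds: int = 5) -> bool:
    """Return True if holder and contender were never in the section together."""
    with tempfile.TemporaryDirectory() as lock_dir:
        hb = os.path.join(lock_dir, "heartbeat")
        open(hb, "w").close()
        os.utime(hb, (0, 0))  # backdate: earlier refreshes are assumed to have failed
        holder = Holder(hb, touch=failing_touch(code))
        contender = Contender(hb, stale_after=30.0,
                              owner_is_dead=lambda: False,  # holder process is alive
                              require_dead_owner=require_dead_owner)
        for _ in range(rounds):
            contender.poll(now=time.time())
            if holder.in_critical_section and contender.in_critical_section:
                return False
            holder.heartbeat()
        return True
```

Calling run_scenario(require_dead_owner=False) models the mtime-only detector and reports concurrent entry, while run_scenario(require_dead_owner=True) does not, which is the property the regression test is meant to pin down.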

Evidence To Collect

Paid Scope

The $29 incident triage reviews one lock or runner failure and returns the safest next diagnostic step. The $29 deep cleanup turns one representative incident into a stale-lock policy, failure taxonomy, and regression checklist for your agent, CLI, or CI tool.

Deep Cleanup

Still full after the free cleanup?

Send your email once. We reply with the $29 payment link, one clarification, or a no-pay answer if the free cleanup is enough.