SafeDisk AI

Artifact Cache Write Failure Connection Leak

When a package proxy streams an upstream response to the client and into a cache at the same time, a disk-full or S3 write failure must not leave the main request blocked until `WriteTimeout` while both client and upstream connections stay open.

No credentials, package contents, repository-private names, or customer artifacts are needed. A public-safe symptom is enough to scope the policy.

$99 cache failure policy

Turn one cache-write hang into a reusable artifact proxy contract.

Use this when npm, PyPI, Cargo, Maven, Go module, or artifact proxy handlers tee upstream bytes into a disk/S3 cache while serving the client.

Cache failure must unblock the response path, not hold two connections until timeout.

Read-only evidence

Capture stream topology, cache backend, timeout behavior, and connection pressure.

These checks do not need package contents, repository secrets, credentials, or customer artifacts. The useful signal is where the cache writer exits and whether the response path unblocks.

tee path -> cache backend -> error injection -> unblock time -> held connections

Runbook: Cache Failure Must Not Own The Request

  1. Do not let cache persistence be a hidden hard dependency for serving an upstream artifact response.
  2. When the cache writer exits early, explicitly drain or close the cache-side reader so the tee writer does not block forever.
  3. Use `CloseWithError`, context cancellation, or equivalent signaling so the response path knows the cache side is gone.
  4. Set deadlines for cache writes and object-store puts that are shorter than the client-facing `WriteTimeout`, so a stalled cache write fails before the handler times out.
  5. Test disk-full, quota-exceeded, short-write, and object-store failure paths with multiple concurrent artifact requests.
  6. Verify the error path releases client connections, upstream connections, goroutines, temp files, and file descriptors promptly.
  7. Write cache objects atomically and remove partial temp files so a failed cache write cannot poison future reads.

Copy-ready issue reply

Use this when a cache-write failure blocks the streaming handler.

This keeps the fix measurable: cache failure is injected, the response path unblocks promptly, and connection pressure is bounded.

I would turn this into a cache-failure contract for every artifact handler, not just a local fix in one backend.

Acceptance checks I would add:
- Inject ENOSPC/EDQUOT or S3 PutObject failure after some bytes have already streamed.
- The client-facing handler returns promptly; it should not wait until WriteTimeout.
- The upstream response body is closed promptly on cache failure or client abort.
- The cache side drains or closes the pipe reader so the tee writer cannot block forever.
- Partial temp files are removed and never become readable cache entries.
- A concurrent test proves failing cache writes do not hold client connections, upstream connections, goroutines, or fds until timeout.
- The same behavior is covered for npm, PyPI, Cargo, Maven, and Go module handlers if they share the pattern.

Paid scope

Turn one cache-write failure into a reusable artifact proxy policy.

The $99 policy is for package mirrors, artifact proxies, build caches, CI dependency caches, and S3-backed cache services where disk-full or object-store errors can block request handlers and exhaust connections.

No credentials, package contents, repository-private names, or customer artifacts. A public-safe symptom is enough to start.

Do Not Treat As A Cache Miss