SafeDisk AI

Container Registry Backing Store Full 507

A registry can still answer reads while being unable to accept a single blob byte. Treat local ENOSPC, S3 quota exhaustion, and blocking storage backends as an explicit write-health failure, not an opaque 500 or a healthy `/v2/` response.
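A minimal sketch of that mapping for a local filesystem backend, assuming the storage layer surfaces failures as `OSError`. The function name `status_for_storage_error` is hypothetical, and object-store quota errors would need their own backend-specific mapping, which is not shown here.

```python
import errno
from http import HTTPStatus

# Errno values meaning "the backing store cannot take more bytes".
# EDQUOT is looked up defensively in case a platform does not define it.
FULL_STORE_ERRNOS = {
    code
    for code in (errno.ENOSPC, getattr(errno, "EDQUOT", None))
    if code is not None
}


def status_for_storage_error(exc: OSError) -> HTTPStatus:
    """Map a storage-layer OSError to an operator-facing HTTP status."""
    if exc.errno in FULL_STORE_ERRNOS:
        # Full disk or exhausted quota: report 507 rather than a generic 500.
        return HTTPStatus.INSUFFICIENT_STORAGE
    # Anything else remains an internal error.
    return HTTPStatus.INTERNAL_SERVER_ERROR


if __name__ == "__main__":
    full = OSError(errno.ENOSPC, "No space left on device")
    print(int(status_for_storage_error(full)))
```

The point of the sketch is only that the full-store case gets its own stable status class; where such a mapping would live in a given registry implementation is not something this page specifies.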

Free registry gate

Separate read health from write health before routing pushes to the registry.

Use this when blob uploads hang, pushes return `500 UNKNOWN`, S3 or Ceph quotas are exceeded, or a Kubernetes registry pod remains ready while its backing store cannot write.

Read health is not write health; a full backing store should produce 507 or fail readiness.
Read-only evidence

Capture status code, backend error, readiness behavior, and upload debris.

These checks do not need credentials, image layers, private repository names, or registry secrets. The useful signal is whether write failure is mapped, bounded, and routed out of service.

push result -> backend error -> write probe -> readiness -> upload cleanup

Runbook: A Readable Registry Can Still Be Write-Dead

  1. Do not use `GET /v2/` or a read-only storage stat as proof that pushes can succeed.
  2. Translate full-store errors into a stable operator-facing status: local `ENOSPC`, `EDQUOT`, S3 quota exceeded, and object-store capacity errors should not collapse into a generic 500.
  3. Add a write-aware readiness probe that initiates a tiny upload against localhost and expects the normal upload-start response within a strict timeout.
  4. If the backend can block writes instead of returning errors, add server-side deadlines or readiness timeouts so upload connections do not remain established indefinitely.
  5. When write readiness fails, return 503 from the main listener or remove the instance from the load balancer for push traffic.
  6. Keep probe upload debris bounded with upload purging and a dedicated probe repository that cannot be mistaken for, or hide, real tenant data.
  7. Document the safe cleanup boundary: upload sessions, temp files, logs, and garbage-collection steps are different from manifests and referenced blobs.
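Steps 3 and 5 can be sketched as a small probe, under stated assumptions: the registry listens on localhost without authentication, the repository name `health/write-probe` is a hypothetical placeholder, and an upload start is a `POST` to `/v2/<name>/blobs/uploads/` that answers `202 Accepted` when the store is writable. The sketch does not cancel the upload session it opens; it leaves that debris to upload purging, as in step 6.

```python
import urllib.request


def upload_start_ok(status: int) -> bool:
    """Only 202 Accepted counts as a successful upload start."""
    return status == 202


def probe_write_health(base_url: str,
                       repo: str = "health/write-probe",
                       timeout: float = 2.0) -> bool:
    """Return True if an upload start succeeds within `timeout` seconds.

    `repo` is a hypothetical dedicated probe repository, not a tenant one.
    """
    url = f"{base_url.rstrip('/')}/v2/{repo}/blobs/uploads/"
    req = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return upload_start_ok(resp.status)
    except OSError:
        # Covers HTTP error statuses (500, 507, ...), refused connections,
        # and timeouts: each one means the instance is not write-ready.
        return False


def readiness_status(write_ready: bool) -> int:
    """Status a write-readiness endpoint could return (step 5)."""
    return 200 if write_ready else 503
```

A readiness handler could call `probe_write_health("http://127.0.0.1:5000")` and return `readiness_status(...)`; the port is an assumption. Treating a timeout the same as an error is deliberate, since a backend that blocks is as unable to accept pushes as one that fails.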
Copy-ready issue reply

Use this when a full backing store appears as 500 or a hang.

This keeps the fix portable across registry versions and backends: map full-store errors, fail write-readiness, and bound blocking writes.

I would frame this as a registry write-health contract, not only a v2.x status-code gap.

Acceptance checks I would want:
- Full local stores map ENOSPC/EDQUOT to a stable 507 Insufficient Storage response instead of UNKNOWN/500.
- Object-store or S3 quota errors map to the same operator-facing class.
- Read-only health remains separate from write health; GET /v2/ can stay green while push readiness fails.
- A write-aware readiness mode initiates a tiny upload and fails within a bounded timeout if the backing store cannot accept writes.
- Blocking backends get request/write deadlines so upload sockets do not stay established forever.
- Probe upload debris is bounded by upload purging or a dedicated probe repository.
- Docs say which upload/session files are safe to purge and which blob/manifest paths are not.
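The deadline check in the list above could be exercised with something like the following sketch. It assumes the blocking backend call can be wrapped as a plain function; `write_with_deadline` and `write_fn` are hypothetical names, not part of any registry API.

```python
import concurrent.futures


def write_with_deadline(write_fn, payload: bytes, deadline_s: float):
    """Run write_fn(payload), raising TimeoutError after deadline_s seconds.

    Caveat: the worker thread is not cancelled on timeout. This bounds how
    long the caller waits, not how long the backend call itself runs.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(write_fn, payload)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(
            f"backing-store write exceeded {deadline_s}s deadline"
        ) from None
    finally:
        # Do not wait for a possibly blocked worker before returning.
        pool.shutdown(wait=False)
```

A caller could translate the `TimeoutError` into a failed readiness result or an explicit error response. Because the worker keeps running after a timeout, this only shows the caller-side bound; deadlines enforced inside the storage driver or at the socket level would be needed to actually release the stuck write.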
Paid scope

Turn one registry incident into a reusable backing-store-full policy.

The $99 policy is for self-hosted OCI registries, Kubernetes image registries, CI image caches, S3-backed registries, and Ceph-backed filesystem registries where reads can stay healthy while writes fail.

No credentials, image layers, private repository names, or registry secrets are needed. A public-safe symptom is enough to start.

Do Not Delete First