SafeDisk AI

MySQL Binlog Disk Full Service Outage

When MySQL binary logs fill a small VPS, the symptom is often much larger than "delete a few files": MySQL waits on ENOSPC, web workers back up, CPU climbs, and the application returns 502s. Capture the binlog evidence first, then choose a retention, replication, or template fix that will not break point-in-time recovery.

Free first pass

Prove whether binary logs are the outage driver before deleting anything.

The first pass should answer four questions: how full the filesystem is, how much space binlogs consume, whether replication/PITR depends on them, and whether the template should disable or expire binlogs for this deployment.

Read-only evidence

Measure binlog growth without touching database state.

These checks are intentionally read-only. They show filesystem pressure, binlog count/size, current MySQL binary-log settings, and whether the service is configured as a single-node app database or a replication/PITR source.

df -h; du -sh /var/lib/mysql/binlog*; mysql -e "SHOW BINARY LOGS;"
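
To turn the binlog listing into one number, the per-file sizes can be totalled. This is a minimal sketch: the helper name `sum_binlog_bytes` is hypothetical, and it assumes the mysql client can authenticate from its own config and that file size is the second tab-separated column of `SHOW BINARY LOGS` output.

```shell
# Sum the File_size column (column 2) of tab-separated SHOW BINARY LOGS rows.
# Reads rows on stdin and prints the total in bytes.
# "%.0f" is used so totals above 2 GiB print correctly in any awk.
sum_binlog_bytes() {
  awk -F '\t' '{ total += $2 } END { printf "%.0f\n", total + 0 }'
}

# Usage (read-only; -N drops the header row, -B gives tab-separated output):
#   mysql -N -B -e "SHOW BINARY LOGS" | sum_binlog_bytes
```

Comparing this total with the `df -h` figure for the same filesystem shows whether binlogs account for most of the pressure or only part of it.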

Runbook: Pick The Safe Binlog Fix

  1. Confirm the exact failure: MySQL writing a binlog.* file with OS errno 28, not generic Docker overlay, app logs, or inode exhaustion.
  2. Measure binlog space separately from the database directory. A single-node app can often use a different policy than a replication or point-in-time-recovery source.
  3. Check whether binary logs are required. If there is no replica, no CDC consumer, and no PITR backup process, the template may not need binary logging at all.
  4. If binlogs are required, set an explicit retention budget: an expiry window in seconds or days, a cap on total binlog bytes enforced through monitoring, and an alert that fires before the disk reaches the write-failure cliff.
  5. If switching database engines or templates, treat it as migration work. Validate backups, import path, healthchecks, and resource settings before replacing the stateful service.
  6. Convert the incident into a reusable guard: no unbounded logs by default, a minimum free-space check, and a template review whenever upstream image defaults change.
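
The minimum free-space check from the last step could look roughly like the sketch below. The function names and the 80/90 percent thresholds are illustrative assumptions, not values taken from this runbook. Tune them to the size of the VPS.

```shell
# Classify a filesystem usage percentage against warn/critical thresholds.
# Defaults (80 and 90) are assumed values for illustration only.
classify_usage() {
  used_pct=$1
  warn_pct=${2:-80}
  crit_pct=${3:-90}
  if [ "$used_pct" -ge "$crit_pct" ]; then
    echo critical
  elif [ "$used_pct" -ge "$warn_pct" ]; then
    echo warn
  else
    echo ok
  fi
}

# Print the used percentage of the filesystem that holds the given path.
# df -P gives the portable one-line-per-filesystem layout; column 5 is capacity.
used_pct_for() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage:
#   classify_usage "$(used_pct_for /var/lib/mysql)"
```

Keeping the classification separate from the `df` call means the thresholds can be tested without a full disk, and the same function can be reused from a cron job or a container healthcheck.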
Copy-ready issue reply

Use this when MySQL binlogs filled a VPS.

This reply keeps the discussion focused on the policy decision: disable binlogs for single-node templates, or retain them with explicit limits when replication/PITR needs them.

I would split the fix into immediate recovery and template policy.

Read-only evidence before deleting binlogs:

df -h
df -i
du -sh /var/lib/mysql /var/lib/mysql/binlog* 2>/dev/null | sort -h
mysql -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'binlog_expire_logs_seconds'; SHOW VARIABLES LIKE 'expire_logs_days'; SHOW BINARY LOGS;"
mysql -e "SHOW REPLICA STATUS\G" 2>/dev/null || mysql -e "SHOW SLAVE STATUS\G" 2>/dev/null || true

For a single-node app template with no replica, CDC, or point-in-time recovery flow, I would avoid unbounded binary logging by default: either disable it explicitly or set a short expiry plus a disk alert. For deployments that do need binlogs, the template should make retention visible and bounded so a small VPS cannot silently turn binlogs into a full service outage.
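
For the short-expiry option, a hedged sketch of how a template could generate the retention setting is shown here. The helper name `retention_sql` and the 3-day window are assumptions for illustration. `binlog_expire_logs_seconds` and `SET PERSIST` are available in MySQL 8.0 and later. Older servers use `expire_logs_days` instead.

```shell
# Print a SQL statement that bounds binlog retention to a number of days.
# The statement is only printed, never executed, so it can be reviewed first.
retention_sql() {
  days=$1
  seconds=$((days * 86400))
  printf 'SET PERSIST binlog_expire_logs_seconds = %s;\n' "$seconds"
}

# Usage: print the statement for a 3-day window, review it, then apply it
# deliberately with the mysql client:
#   retention_sql 3
```

Disabling binary logging entirely is a server startup option rather than a runtime setting, so that choice belongs in the template's MySQL configuration, not in a one-off statement.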
Paid scope

Turn one binlog outage into a safer template policy.

The $99 policy is for teams shipping one-click templates, self-hosted app stacks, or database-backed CI/dev services. You get the safe/review/do-not-touch cleanup boundary, retention settings, monitoring guard, and rollout checklist for one representative incident.

No secrets, database dumps, private logs, or credentials. A public-safe summary is enough to start.

Do Not Delete First