Redis: MISCONF Errors writing to the AOF file: No space left on device
Description
Redis Append-Only File (AOF) persistence fails when the Persistent Volume backing Redis (bound through its Persistent Volume Claim, PVC) runs out of disk space. Redis then stops accepting writes, and dependent services (for example, GitLab KAS and Sidekiq) fail their readiness probes.
Environment
- Impacted offerings:
  - GitLab Self-Managed (Kubernetes)
- Impacted versions:
  - Any version with AOF persistence enabled for Redis.
Cause
Redis uses AOF persistence to log every write operation received by the server. These logs help reconstruct the dataset during server startup. However, AOF persistence has certain drawbacks:
- AOF files are larger than RDB files for the same dataset.
- The append-only nature of AOF can cause significant storage consumption over time.
For example:
- Incrementing a counter 100 times writes 100 entries to the AOF file, even though only the final value matters. Redis mitigates this by periodically rewriting the AOF into a compact form, but a high write rate between rewrites can still make the file grow quickly.
Solution
Use one or more of the following options to recover from, or prevent, AOF disk exhaustion.
1. Increase Persistent Volume Size
Resize the PVC to meet Redis's storage requirements for AOF persistence. Refer to the Redis documentation for disk sizing in heavy-write scenarios.
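The resize itself can be sketched roughly as follows. This is a minimal example under assumptions, not a confirmed procedure for any specific chart: the namespace `gitlab`, the PVC name `redis-data-gitlab-redis-master-0`, and the target size `20Gi` are hypothetical placeholders, and the helper names `build_pvc_patch` and `resize_pvc` are illustrative. Expansion only works when the StorageClass has `allowVolumeExpansion: true`.

```shell
# Hypothetical names; look up the real ones with: kubectl get pvc -n <namespace>
NAMESPACE="gitlab"
PVC_NAME="redis-data-gitlab-redis-master-0"
NEW_SIZE="20Gi"

# Build the JSON merge patch that raises the PVC's storage request.
build_pvc_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Apply the patch. Expansion requires allowVolumeExpansion: true on the
# StorageClass; shrinking a PVC is not supported.
resize_pvc() {
  kubectl patch pvc "$PVC_NAME" -n "$NAMESPACE" --type merge \
    -p "$(build_pvc_patch "$NEW_SIZE")"
}

# Usage (not run automatically):
#   resize_pvc
#   kubectl get pvc "$PVC_NAME" -n "$NAMESPACE"   # confirm the new capacity
```

Depending on the storage driver, the filesystem may only grow after the pod using the volume is restarted.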
2. Manually Trigger an AOF Rewrite
Reduce the AOF file size without disrupting services by triggering a manual rewrite:
redis-cli BGREWRITEAOF
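To check whether the rewrite ran and what it achieved, the `INFO persistence` section reports fields such as `aof_rewrite_in_progress`, `aof_last_bgrewrite_status`, and `aof_current_size`. The sketch below assumes `redis-cli` can reach the instance; the helper name `aof_field` is illustrative only. Note that a rewrite writes a new file before replacing the old one, so it may fail on a volume that is already completely full; in that case, free or add space first.

```shell
# Extract one field from `INFO persistence` output read on stdin.
# redis-cli output may use CRLF line endings, so strip carriage returns first.
aof_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Usage against a live instance (not run automatically):
#   redis-cli INFO persistence | aof_field aof_rewrite_in_progress    # 1 while rewriting
#   redis-cli INFO persistence | aof_field aof_last_bgrewrite_status  # ok or err
#   redis-cli INFO persistence | aof_field aof_current_size           # size in bytes
```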
3. Optimize AOF Rewrite Settings (Config Change)
You can configure AOF rewrite thresholds to balance performance and disk usage.
Current Omnibus Defaults:
auto-aof-rewrite-percentage: 100
auto-aof-rewrite-min-size: 64mb
Example of Optimized Settings:
auto-aof-rewrite-percentage: 50 # Rewrite when the AOF has grown 50% beyond its size after the last rewrite
auto-aof-rewrite-min-size: 512mb # Skip automatic rewrites until the AOF is larger than 512 MB
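How the two thresholds interact can be illustrated with a little arithmetic. The function below is a simplified sketch of the trigger rule (the file must exceed the minimum size, and its growth over the size recorded after the last rewrite must reach the percentage); `rewrite_due` is a hypothetical helper, not a Redis command, and the sizes are made-up examples.

```shell
# Return 0 (true) if an automatic rewrite would be due under the rule:
#   current size > min size  AND  growth over base size >= percentage.
# All sizes are in bytes; a percentage of 0 disables automatic rewrites.
rewrite_due() {
  current=$1; base=$2; percentage=$3; min_size=$4
  [ "$percentage" -gt 0 ] || return 1
  [ "$current" -gt "$min_size" ] || return 1
  [ "$base" -gt 0 ] || base=1
  growth=$(( current * 100 / base - 100 ))
  [ "$growth" -ge "$percentage" ]
}

MB=$(( 1024 * 1024 ))
# 600 MB after the last rewrite, 950 MB now: growth is 58%,
# which is above 50% and above the 512 MB floor.
rewrite_due $(( 950 * MB )) $(( 600 * MB )) 50 $(( 512 * MB )) && echo "rewrite due"
# Same file under the defaults (100%, 64mb): 58% growth is not enough yet.
rewrite_due $(( 950 * MB )) $(( 600 * MB )) 100 $(( 64 * MB )) || echo "not yet"
```

Both settings can also be changed on a running instance with `redis-cli CONFIG SET auto-aof-rewrite-percentage 50`, although a change made this way is likely to be lost on restart unless the deployment's configuration is updated as well.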
Additional Information
- AOF files grow significantly in write-heavy environments if proper thresholds are not configured.
- To prevent sudden disk exhaustion, monitor AOF growth regularly and ensure enough disk space is provisioned.
- In Kubernetes-based installations, Redis typically uses Persistent Volumes and Persistent Volume Claims for storage. These volumes usually have a fixed size set by the storage class or configuration, which makes this issue more common in Kubernetes environments.
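A basic disk usage check for the Redis data volume might look like the sketch below. It is an illustration under assumptions rather than a monitoring recommendation: the `/data` path, the 80% threshold, and the helper names are hypothetical, and the real AOF directory depends on how Redis is deployed.

```shell
# Read `df -P` output on stdin and print the used-space percentage
# (without the % sign) from the first data row.
parse_used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Used-space percentage for the filesystem holding the given path.
disk_used_pct() {
  df -P "$1" | parse_used_pct
}

# Warn when usage crosses a threshold. The /data default and the 80% limit
# are assumptions; use the directory where your Redis writes its AOF.
check_redis_disk() {
  path=${1:-/data}; limit=${2:-80}
  used=$(disk_used_pct "$path")
  [ -n "$used" ] || return 2
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full (limit ${limit}%)"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}

# Usage inside the Redis pod (pod and namespace names are hypothetical):
#   kubectl exec -n gitlab gitlab-redis-master-0 -- df -P /data
```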