High CPU load for database server if project has a large number of pipelines
Issue
If CPU load on the database server is very high, the auto-cancel redundant pipelines setting might be the cause. The load can also back up the Sidekiq queue and slow down background job processing, with side effects such as pipelines taking a long time to pick up jobs, merge requests taking a long time to load, and overall instance slowness.
Environment
- Impacted offerings:
  - GitLab Self-Managed
- Impacted versions:
  - Up to 17.4.2.
Cause
When auto-cancel redundant pipelines is enabled for a project, the CancelRedundantPipelinesService uses a database query to find all pipelines created before a certain date. This query consumes a lot of CPU on the database server when a project has a large number of pipelines.
Resolution
To determine whether this issue is the cause of the CPU load:
- Check the Sidekiq queue and utilization. Look for the CancelRedundantPipelinesService worker. It is usually long-running or has a large number of jobs in the queue.
- Run the following query in the database console to identify the projects with the largest number of pipelines:
select project_id, count(project_id) as pipeline_count from ci_pipelines group by project_id having count(project_id) > 10000 order by pipeline_count desc;
- You can disable auto-cancel redundant pipelines for the projects with the largest number of pipelines, or disable the service instance-wide by enabling the feature flag disable_cancel_redundant_pipelines_service in the Rails console:
Feature.enable(:disable_cancel_redundant_pipelines_service)
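If you choose the per-project option, the setting can also be changed in bulk. The following is a minimal sketch, not an official procedure: it assumes the project edit endpoint of the GitLab REST API (PUT /api/v4/projects/:id) accepts the auto_cancel_pending_pipelines attribute with the value "disabled", and that you have a token allowed to edit the projects. The instance URL, the environment variable names, and the helper names build_disable_request and disable_auto_cancel are hypothetical and used for illustration only.

```ruby
require "net/http"
require "uri"
require "json"

# Assumptions (not from this article): the instance URL and the token
# come from environment variables; the defaults are placeholders.
GITLAB_URL = ENV.fetch("GITLAB_URL", "https://gitlab.example.com")
TOKEN      = ENV.fetch("GITLAB_TOKEN", "")

# Build (but do not send) a PUT request that sets the project's
# auto_cancel_pending_pipelines attribute to "disabled".
def build_disable_request(base_url, project_id, token)
  uri = URI.join(base_url, "/api/v4/projects/#{project_id}")
  request = Net::HTTP::Put.new(uri)
  request["PRIVATE-TOKEN"] = token
  request["Content-Type"] = "application/json"
  request.body = JSON.generate({ "auto_cancel_pending_pipelines" => "disabled" })
  request
end

# Send the request and return the HTTP status code as a string.
def disable_auto_cancel(base_url, project_id, token)
  request = build_disable_request(base_url, project_id, token)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request).code
  end
end

if __FILE__ == $PROGRAM_NAME
  # Project IDs are passed on the command line, for example the
  # project_id values returned by the pipeline count query.
  ARGV.map { |arg| Integer(arg) }.each do |project_id|
    status = disable_auto_cancel(GITLAB_URL, project_id, TOKEN)
    puts "#{project_id}: HTTP #{status}"
  end
end
```

Try it against a single low-risk project first and confirm the setting changed in the project's CI/CD settings before you apply it to the projects with the most pipelines.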
To work around this issue without disabling the auto-cancel redundant pipelines setting on the project:
- Allocate more CPU resources to the database server.
- If Sidekiq is overloaded and your projects have a very large number of pipelines, you might need to add more Sidekiq processes for the ci_cancel_redundant_pipelines queue.