fix(k8s): raise schemas CPU request from 20m to 100m #840

Merged
argoyle merged 1 commit from bump-schemas-cpu-request into main 2026-05-19 07:23:06 +00:00
Owner

The HPA was pinning the deployment at maxReplicas (4) even though aggregate CPU usage was low.

Diagnosis

The service is event-driven: pods sit at ~0-1m idle but spike to 100-300m per supergraph query. With requests.cpu=20m those bursts read as 500-1500% utilization, so the HPA's 60% target was constantly exceeded on whichever pod handled the request.

Observed per-pod samples (15s apart):

sample 1 : cckcp=110m, others ~0m
sample 2 : vznkq=181m, others ~0m
sample 3 : vznkq=304m, others ~0m
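
The percentages quoted in the diagnosis follow from dividing each sampled usage by the CPU request. A minimal Python sketch of that arithmetic, using the three samples above. The function name `utilization_percent` is hypothetical, and this only illustrates the per-pod ratio, not how the HPA controller aggregates across pods:

```python
def utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU usage expressed as a percentage of the container's CPU request."""
    if request_millicores <= 0:
        raise ValueError("request must be positive")
    return 100 * usage_millicores / request_millicores


# Busy-pod samples from the PR description, in millicores.
samples = {"cckcp (sample 1)": 110, "vznkq (sample 2)": 181, "vznkq (sample 3)": 304}

for pod, usage in samples.items():
    old = utilization_percent(usage, 20)   # request before this PR
    new = utilization_percent(usage, 100)  # request after this PR
    print(f"{pod}: {old:.0f}% at 20m, {new:.0f}% at 100m")
```

The same burst reads five times lower against the 100m request, since the request is five times larger.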

The HPA status confirmed this: ScalingLimited (TooManyReplicas) and ScaleDownStabilized kept replicas pinned at 4.

Fix

Raise requests.cpu to 100m. Bursts now read as 100-300% instead of 500-1500%. Combined with the HPA's downscale stabilization window, this lets replicas settle back to minReplicas (2) between bursts.
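
For the idle interval between bursts, the replica count the HPA aims for can be approximated with the documented formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the min/max bounds. The sketch below rests on several assumptions: `desired_replicas` is a hypothetical helper, utilization is taken as total usage over total requests, every pod has the same request, and neither the stabilization window nor the controller's tolerance is modelled.

```python
import math


def desired_replicas(usages_m, request_m, target_percent, min_replicas, max_replicas):
    """Approximate the HPA's desired replica count from per-pod CPU usage (millicores)."""
    current = len(usages_m)
    if current == 0 or request_m <= 0:
        raise ValueError("need at least one pod and a positive request")
    # Utilization as a percentage of the total requested CPU across pods.
    utilization = 100 * sum(usages_m) / (request_m * current)
    raw = math.ceil(current * utilization / target_percent)
    # Clamp to the HPA's configured bounds.
    return max(min_replicas, min(max_replicas, raw))


# Four pods idling at roughly 1m each, with the values from this PR:
# 100m request, 60% target, minReplicas 2, maxReplicas 4.
idle = [1, 1, 1, 1]
print(desired_replicas(idle, request_m=100, target_percent=60,
                       min_replicas=2, max_replicas=4))  # prints 2
```

In the real controller the scale-down only takes effect once the stabilization window has passed without a higher recommendation.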

argoyle added 1 commit 2026-05-19 07:08:17 +00:00
fix(k8s): raise schemas CPU request from 20m to 100m
schemas / vulnerabilities (pull_request) Successful in 2m7s
schemas / check (pull_request) Successful in 2m52s
schemas / check-release (pull_request) Successful in 5m32s
pre-commit / pre-commit (pull_request) Successful in 7m2s
schemas / build (pull_request) Successful in 6m24s
schemas / deploy-prod (pull_request) Has been skipped
dae4e8a135
HPA was pinning the deployment at maxReplicas (4) even though aggregate
CPU usage was low. The service is event-driven: pods sit at ~0-1m idle
but spike to 100-300m per supergraph query. With requests.cpu=20m those
bursts read as 500-1500% utilization, so the HPA's 60% target was
constantly exceeded on whichever pod handled the request.

Raise the request to 100m so bursts read as 100-300% instead. Combined
with the HPA's downscale stabilization window this lets replicas settle
back to minReplicas (2) between bursts.
argoyle scheduled this pull request to auto merge when all checks succeed 2026-05-19 07:09:04 +00:00
argoyle canceled auto merging this pull request when all checks succeed 2026-05-19 07:09:50 +00:00
argoyle scheduled this pull request to auto merge when all checks succeed 2026-05-19 07:10:04 +00:00
argoyle merged commit 9a4b05d897 into main 2026-05-19 07:23:06 +00:00
argoyle deleted branch bump-schemas-cpu-request 2026-05-19 07:23:07 +00:00
Reference: unboundsoftware/schemas#840