fix(k8s): raise schemas CPU request from 20m to 100m #840

2026-05-19T07:08:16Z

argoyle commented

2026-05-19 07:08:16 +00:00

The HPA was pinning the deployment at maxReplicas (4) even though aggregate CPU usage was low.

Diagnosis

The service is event-driven: pods sit at ~0-1m idle but spike to 100-300m per supergraph query. With requests.cpu=20m those bursts read as 500-1500% utilization, so the HPA's 60% target was constantly exceeded on whichever pod handled the request.

Observed per-pod samples (15s apart):

sample 1 : cckcp=110m, others ~0m
sample 2 : vznkq=181m, others ~0m
sample 3 : vznkq=304m, others ~0m
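
To make the percentages concrete, here is a minimal Python sketch that converts the three sampled bursts into utilization relative to the CPU request. It only restates the arithmetic above. The helper name `pct_of_request` is hypothetical, and it looks at the bursting pod alone, not at the cross-pod average the HPA acts on.

```python
def pct_of_request(usage_m: float, request_m: float) -> float:
    """CPU usage as a percentage of the CPU request (both in millicores)."""
    return 100 * usage_m / request_m


# Burst values from the three samples above, in millicores.
BURSTS_M = [110, 181, 304]

for usage in BURSTS_M:
    print(f"{usage}m -> {pct_of_request(usage, 20):.0f}% of a 20m request")
```

With a 20m request the three samples come out at 550%, 905% and 1520%, all far above the 60% target.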

HPA status confirmed this: ScalingLimited (TooManyReplicas) and ScaleDownStabilized kept replicas pinned at 4.

Fix

Raise requests.cpu to 100m. Bursts now read as 100-300% instead of 500-1500%. Combined with the HPA's downscale stabilization window this lets replicas settle back to minReplicas (2) between bursts.
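
The effect on the replica count can be sketched with the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), where utilization is averaged across pods. This is a simplified model, not the controller's actual code: it assumes the "others ~0m" pods use exactly 0m, applies the default 10% tolerance, and ignores the stabilization window, pod readiness, and missing metrics. The function names are hypothetical.

```python
import math


def avg_utilization_pct(usages_m, request_m):
    """Average utilization across pods: total usage over total requests, in percent."""
    return 100 * sum(usages_m) / (request_m * len(usages_m))


def desired_replicas(usages_m, request_m, target_pct=60,
                     min_replicas=2, max_replicas=4, tolerance=0.1):
    """Simplified HPA recommendation for one metrics sample."""
    current = len(usages_m)
    ratio = avg_utilization_pct(usages_m, request_m) / target_pct
    # Within tolerance of the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current
    # Clamp the raw recommendation to the HPA's min/max bounds.
    return max(min_replicas, min(max_replicas, math.ceil(current * ratio)))


# Sample 1 from the diagnosis: one pod at 110m, the other three assumed at 0m.
sample_1 = [110, 0, 0, 0]
for request_m in (20, 100):
    print(request_m,
          avg_utilization_pct(sample_1, request_m),
          desired_replicas(sample_1, request_m))
```

Under these assumptions sample 1 averages 137.5% with a recommendation of 4 at a 20m request, versus 27.5% with a recommendation of 2 at 100m. A larger burst such as sample 3 (304m) still averages 76% at 100m, so the recommendation stays at 4 while that burst is being measured. Settling back to 2 depends on the quieter samples between bursts.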

argoyle added 1 commit 2026-05-19 07:08:17 +00:00

fix(k8s): raise schemas CPU request from 20m to 100m

schemas / vulnerabilities (pull_request) Successful in 2m7s

schemas / check (pull_request) Successful in 2m52s

schemas / check-release (pull_request) Successful in 5m32s

pre-commit / pre-commit (pull_request) Successful in 7m2s

schemas / build (pull_request) Successful in 6m24s

schemas / deploy-prod (pull_request) Has been skipped

dae4e8a135

HPA was pinning the deployment at maxReplicas (4) even though aggregate
CPU usage was low. The service is event-driven: pods sit at ~0-1m idle
but spike to 100-300m per supergraph query. With requests.cpu=20m those
bursts read as 500-1500% utilization, so the HPA's 60% target was
constantly exceeded on whichever pod handled the request.

Raise the request to 100m so bursts read as 100-300% instead. Combined
with the HPA's downscale stabilization window this lets replicas settle
back to minReplicas (2) between bursts.

argoyle scheduled this pull request to auto merge when all checks succeed 2026-05-19 07:09:04 +00:00

argoyle canceled auto merging this pull request when all checks succeed 2026-05-19 07:09:50 +00:00

argoyle scheduled this pull request to auto merge when all checks succeed 2026-05-19 07:10:04 +00:00

argoyle merged commit 9a4b05d897 into main

2026-05-19 07:23:06 +00:00

argoyle deleted branch bump-schemas-cpu-request

2026-05-19 07:23:07 +00:00

argoyle referenced this pull request

2026-05-19 07:38:22 +00:00

perf(graph): cache merged SDL and SchemaUpdate per ref #841

Reference: unboundsoftware/schemas#840