eefab471d6
schemas / vulnerabilities (pull_request) Successful in 2m2s
schemas / check (pull_request) Successful in 2m42s
schemas / check-release (pull_request) Successful in 2m59s
pre-commit / pre-commit (pull_request) Successful in 6m18s
schemas / build (pull_request) Successful in 5m47s
schemas / deploy-prod (pull_request) Has been skipped
Even with the schema cache and startup warmup, residual CPU bursts (metrics-server occasionally samples a request mid-flight) were enough to trip a brief scale-up, after which the default 5min scaleDown stabilization pinned the deployment at maxReplicas long after the spike had subsided. Tune both directions: - scaleUp.stabilizationWindowSeconds: 120 — a transient spike must persist for two consecutive minutes before any pod is added. Brief metric anomalies no longer move replicas. - scaleUp policy: add at most 1 pod per 60s. Smooths reaction. - scaleDown.stabilizationWindowSeconds: 120 (default 300) — once the workload calms, return to minReplicas faster. - scaleDown policy: remove at most 1 pod per 60s. Avoids thundering-herd scale-down.
41 lines
1.1 KiB
YAML
41 lines
1.1 KiB
YAML
apiVersion: autoscaling/v2
|
|
kind: HorizontalPodAutoscaler
|
|
metadata:
|
|
labels:
|
|
app.kubernetes.io/name: schemas
|
|
name: schemas
|
|
spec:
|
|
scaleTargetRef:
|
|
apiVersion: apps/v1
|
|
kind: Deployment
|
|
name: schemas
|
|
minReplicas: 2
|
|
maxReplicas: 4
|
|
metrics:
|
|
- type: Resource
|
|
resource:
|
|
name: cpu
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 60
|
|
behavior:
|
|
scaleUp:
|
|
# Wait 2min of sustained high CPU before scaling up. Schemas is
|
|
# event-driven and the per-request work is bursty even with the
|
|
# cache + warmup, so single spikes shouldn't pull replicas up.
|
|
stabilizationWindowSeconds: 120
|
|
policies:
|
|
- type: Pods
|
|
value: 1
|
|
periodSeconds: 60
|
|
scaleDown:
|
|
# Default 300s window kept pods pinned at maxReplicas long after
|
|
# the triggering spike had subsided. 120s is long enough to avoid
|
|
# flapping but lets the deployment return to minReplicas quickly
|
|
# once the workload calms.
|
|
stabilizationWindowSeconds: 120
|
|
policies:
|
|
- type: Pods
|
|
value: 1
|
|
periodSeconds: 60
|