
ci: use pod-level Kubernetes resource variables #11811

Draft
orioly13 wants to merge 1 commit into master from dprediger/ciexe-2021-pod-level-resources


orioly13 commented Jun 30, 2026


Summary

Migrates dd-trace-java GitLab CI from per-container Kubernetes resource
variables to pod-level variables, as part of CIEXE-2021
(epic: CIEXE-2150).

Reference: DataDog/datadog-static-analyzer#924

What changed

  • Replaced KUBERNETES_CPU_REQUEST / KUBERNETES_MEMORY_REQUEST /
    KUBERNETES_MEMORY_LIMIT in .tier_m and .tier_l anchors with
    KUBERNETES_POD_CPU_REQUEST, KUBERNETES_POD_CPU_LIMIT,
    KUBERNETES_POD_MEMORY_REQUEST, KUBERNETES_POD_MEMORY_LIMIT.
  • Updated two native-image Gradle builds (quarkus-native,
    spring-boot-3.0-native) to read the renamed env var for CPU
    parallelism sizing.
  • Budgets unchanged: tier_m = 6 CPU / 16 Gi, tier_l = 10 CPU / 20 Gi.
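
For reviewers who want a picture of the end state, the anchors might look roughly like the sketch below. This is an illustration, not the actual diff: the anchor layout is assumed, and setting each limit equal to its request is an assumption based on the single budget figure quoted per tier.

```yaml
# Hypothetical sketch of the updated anchors; the real .gitlab-ci.yml may be laid out differently.
.tier_m: &tier_m
  variables:
    KUBERNETES_POD_CPU_REQUEST: "6"
    KUBERNETES_POD_CPU_LIMIT: "6"        # assumed equal to the request
    KUBERNETES_POD_MEMORY_REQUEST: "16Gi"
    KUBERNETES_POD_MEMORY_LIMIT: "16Gi"  # assumed equal to the request

.tier_l: &tier_l
  variables:
    KUBERNETES_POD_CPU_REQUEST: "10"
    KUBERNETES_POD_CPU_LIMIT: "10"       # assumed equal to the request
    KUBERNETES_POD_MEMORY_REQUEST: "20Gi"
    KUBERNETES_POD_MEMORY_LIMIT: "20Gi"  # assumed equal to the request
```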

Behavior changes

  • Pod-level budget covers the full pod (build + helper + init containers
    share one quota) instead of stacking per-container reservations.
  • KUBERNETES_POD_CPU_LIMIT is new — no CPU limit existed before.
    Jobs lose burst headroom in exchange for tighter scheduling isolation.

Rollout order (flag must precede merge)

  1. Enable ci.gitlab-runner.enable-pod-level-resources rule v3 for
    DataDog/dd-trace-java at https://mosaic.us1.ddbuild.io/feature-flags/ci.gitlab-runner.enable-pod-level-resources?targeting-rule=v3
  2. Trigger draft-branch pipeline; inspect one tier_m + tier_l + arm64
    pod spec — confirm spec.resources shows pod-level budget, containers
    show empty resources.
  3. Mark PR ready → review → merge.
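
As a rough guide for step 2, a tier_m pod spec that passes the check might have the shape below. The container names and the limit values are assumptions for illustration; only the pod-level spec.resources block and the empty per-container resources come from the checklist above.

```yaml
# Illustrative shape of a validated tier_m pod; not captured from a real pipeline.
apiVersion: v1
kind: Pod
spec:
  resources:            # pod-level budget shared by all containers
    requests:
      cpu: "6"
      memory: 16Gi
    limits:
      cpu: "6"          # assumed equal to the request
      memory: 16Gi
  containers:
    - name: build       # assumed container name
      resources: {}     # no per-container reservation
    - name: helper      # assumed container name
      resources: {}
```

If any container still carries its own requests or limits, the per-container variables are probably still in effect for that job.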

Rollback

Revert the merge commit first, then disable the flag, not the other way
around. If the flag is off while the YAML is on master, jobs run with
scheduler defaults, which risks OOM on tier_l native-image jobs.

Replace per-container KUBERNETES_CPU_REQUEST / KUBERNETES_MEMORY_*
with pod-level KUBERNETES_POD_* vars in both tier_m and tier_l
anchors. Two native-image Gradle builds are updated to read the
renamed env var for CPU parallelism sizing.

Behavior changes:
- Resources now budget the full pod (build + helper + init
  containers share a single quota) instead of only the build
  container, reducing the effective per-job cluster footprint.
- KUBERNETES_POD_CPU_LIMIT added (no CPU limit existed before);
  jobs lose burst headroom in exchange for tighter scheduling
  isolation. Tier_m: 6 CPU / 16Gi. Tier_l: 10 CPU / 20Gi.

Feature flag ci.gitlab-runner.enable-pod-level-resources (rule v3)
must be enabled for this repo before merging to master. Draft PR
opened for pod-spec validation on the flag-enabled branch first.

Refs: CIEXE-2021, CIEXE-2150
orioly13 added the "tag: no release notes" and "type: refactoring" labels Jun 30, 2026