fix(e2e): align container engine selection across helpers #1944

Open
elezar wants to merge 1 commit into main from fix/1481-align-container-engine-selection/elezar

@elezar commented Jun 17, 2026

Member

Summary

Align container engine selection across build helpers, e2e wrappers, Rust e2e support containers, and Skaffold local image builds. CONTAINER_ENGINE=docker|podman is now the single explicit selector, with CONTAINER_ENGINE_TARGET=local-k8s-cluster used only for local-cluster image workflows.

Related Issue

Closes #1481

Changes

  • Centralize shell container-engine precedence in tasks/scripts/container-engine.sh: explicit CONTAINER_ENGINE, e2e driver requirement, local-cluster target hint, then host auto-detection.
  • Reject the removed OPENSHELL_E2E_CONTAINER_ENGINE selector and conflicting explicit engine selections.
  • Harden ce_info_arch so malformed or unavailable engine metadata fails directly instead of falling back to host architecture.
  • Align Docker/Podman e2e wrappers and Rust e2e support containers with the same e2e selection rules.
  • Switch Skaffold custom image builds from hard-coded Docker to CONTAINER_ENGINE_TARGET=local-k8s-cluster.
  • Document the precedence model and local-cluster scope in architecture/build.md.
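
The precedence described in the list above can be modeled as a small pure function. This is a minimal sketch in Rust, not code from the PR: the `Selection` struct, its field names, and `resolve_engine` are hypothetical, and it assumes that a "conflicting" selection means an explicit CONTAINER_ENGINE that disagrees with the engine the e2e driver requires.

```rust
/// Inputs to engine selection. Field names are illustrative, not from the PR.
#[derive(Default)]
struct Selection<'a> {
    /// Value of CONTAINER_ENGINE, if set and non-empty.
    explicit: Option<&'a str>,
    /// Value of the removed OPENSHELL_E2E_CONTAINER_ENGINE selector, if set.
    removed_selector: Option<&'a str>,
    /// Engine required by the e2e driver, if any.
    required: Option<&'a str>,
    /// Engine implied by CONTAINER_ENGINE_TARGET=local-k8s-cluster, if any.
    cluster_hint: Option<&'a str>,
    /// Engine found by host auto-detection, if any.
    detected: Option<&'a str>,
}

/// Resolve the engine in the order: explicit selection, e2e driver
/// requirement, local-cluster hint, host auto-detection.
fn resolve_engine(sel: &Selection<'_>) -> Result<String, String> {
    // The removed selector is rejected outright rather than silently ignored.
    if sel.removed_selector.is_some() {
        return Err(
            "OPENSHELL_E2E_CONTAINER_ENGINE was removed; set CONTAINER_ENGINE instead"
                .to_string(),
        );
    }

    if let Some(engine) = sel.explicit {
        if engine != "docker" && engine != "podman" {
            return Err(format!("unsupported CONTAINER_ENGINE: {}", engine));
        }
        // Assumption: a conflict is an explicit value that contradicts the
        // engine the e2e driver requires.
        if let Some(required) = sel.required {
            if required != engine {
                return Err(format!(
                    "CONTAINER_ENGINE={} conflicts with required engine {}",
                    engine, required
                ));
            }
        }
        return Ok(engine.to_string());
    }

    // No explicit selection: fall through the remaining sources in order.
    sel.required
        .or(sel.cluster_hint)
        .or(sel.detected)
        .map(String::from)
        .ok_or_else(|| "no container engine could be selected".to_string())
}
```

Keeping the resolver free of environment and filesystem access means the same table of cases could, in principle, be checked by both a shell fixture and a Rust unit test.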

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Commands run:

  • bash -n tasks/scripts/container-engine.sh
  • bash -n e2e/with-docker-gateway.sh
  • bash -n e2e/with-podman-gateway.sh
  • git diff --check
  • Shell fixture checks for explicit Docker selection, removed selector rejection, Docker/Podman conflict rejection, kind+Podman local-cluster inference, and ce_info_arch malformed metadata rejection.
  • mise x -- cargo test --manifest-path e2e/rust/Cargo.toml --lib harness::container
  • mise x -- cargo test --manifest-path e2e/rust/Cargo.toml --no-run --test gpu_device_selection --features e2e-gpu
  • mise x -- cargo test --manifest-path e2e/rust/Cargo.toml --no-run --test local_driver_token_restart --features e2e
  • mise run pre-commit was attempted twice. It did not complete successfully because helm:lint fails on the existing chart dependency state: chart metadata is missing these dependencies: postgresql; the second run also timed out while the Rust test task was still running.
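
The ce_info_arch fixture check listed above concerns a fail-closed rule: when the engine reports no architecture, or one that cannot be parsed, the helper should return an error instead of substituting the host architecture. A rough Rust sketch of that rule follows. `normalize_engine_arch` is a hypothetical name, and the accepted spellings and the amd64/arm64 output names are assumptions, not values confirmed by the PR.

```rust
/// Normalize an architecture string reported by the container engine.
/// `raw` is `None` when the engine metadata could not be read at all.
/// Hypothetical helper: the accepted spellings below are assumptions.
fn normalize_engine_arch(raw: Option<&str>) -> Result<&'static str, String> {
    // Missing or blank metadata is an error; there is deliberately no
    // fallback to the host architecture here.
    let value = raw
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .ok_or_else(|| "engine did not report an architecture".to_string())?;

    match value {
        "x86_64" | "amd64" => Ok("amd64"),
        "aarch64" | "arm64" => Ok("arm64"),
        other => Err(format!("unrecognized engine architecture: {:?}", other)),
    }
}
```

Returning an error here surfaces an unreachable daemon or a changed metadata format at the point of failure, where a host-architecture fallback could hide it until an image for the wrong platform is built.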

Not run:

  • Docker/Podman e2e lanes. Docker CLI is installed but the Docker daemon is not reachable, and podman is not installed on this machine.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@elezar added the area:build, topic:testing, and test:e2e labels Jun 17, 2026
@github-actions

Label test:e2e applied for 649bef8. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

@elezar force-pushed the fix/1481-align-container-engine-selection/elezar branch 2 times, most recently from d34468c to b6685c6 June 17, 2026 09:36
Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar force-pushed the fix/1481-align-container-engine-selection/elezar branch from b6685c6 to 66bdf87 June 17, 2026 13:00
@maxamillion

Collaborator

@elezar this is good, I like this approach. I'm going to close #1220 since this negates the need for it.

@maxamillion

Collaborator

@elezar it looks like if the env vars are left unset or empty, the Rust helper falls back to docker, but the shell script picks podman if it's installed. I'm not sure how likely it is that someone has both installed and bypasses the wrappers by running cargo test directly with the flags needed to hit this, but if they do it could get weird. There might be other scenarios too that I'm not thinking of.

e2e/rust/src/harness/container.rs:228-230

Ok(explicit_engine
    .or(required_engine)
    .unwrap_or_else(|| "docker".to_string()))

tasks/scripts/container-engine.sh:87-103

if command -v podman >/dev/null 2>&1; then
  echo "podman"
  return
fi
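
One way to close the gap described above would be for the Rust helper to probe the host the way the shell script does, instead of defaulting to docker. The sketch below is an illustration under stated assumptions, not the PR's code: `on_path`, `detect_host_engine`, and `select_engine` are hypothetical names, the podman-before-docker order is only inferred from the quoted shell excerpt, and the PATH scan looks for a file by name without testing the executable bit (so it is looser than `command -v`).

```rust
use std::env;

/// Returns true if a file called `name` exists in any directory listed in
/// `path_var` (a PATH-style string). Unix-style names are assumed.
fn on_path(name: &str, path_var: &str) -> bool {
    env::split_paths(path_var).any(|dir| dir.join(name).is_file())
}

/// Host auto-detection. Assumption: podman is probed before docker, to
/// match the behavior described for the shell helper.
fn detect_host_engine(path_var: &str) -> Option<String> {
    for name in ["podman", "docker"] {
        if on_path(name, path_var) {
            return Some(name.to_string());
        }
    }
    None
}

/// Same shape as the quoted Rust excerpt, but with the hard-coded "docker"
/// fallback replaced by host auto-detection and an explicit error.
fn select_engine(
    explicit: Option<String>,
    required: Option<String>,
    path_var: &str,
) -> Result<String, String> {
    explicit
        .or(required)
        .or_else(|| detect_host_engine(path_var))
        .ok_or_else(|| "no container engine found on PATH".to_string())
}
```

A caller would pass the process search path, for example `std::env::var("PATH").unwrap_or_default()`. Taking the path as a parameter keeps the function testable against a temporary directory.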
