
fix(linux): lower host glibc floor to 2.28 to support RHEL/Rocky 8#1934

Open
pimlock wants to merge 18 commits into main from codex/1456-gateway-glibc-228

@pimlock pimlock commented Jun 16, 2026

Collaborator

Summary

Lower the published Linux GNU host-binary glibc floor to 2.28 so OpenShell packages can run on RHEL 8 / Rocky Linux 8 class hosts. The gateway and VM driver release paths now specify explicit cargo-zigbuild glibc 2.28 targets instead of relying on Zig's default GNU target behavior.

The gateway keeps bundled-z3 for Linux release artifacts, so the released gateway remains self-contained instead of adding a system Z3 runtime requirement to each packaging surface.

Related Issue

Fixes #1937
Refs #1456

Changes

  • Build Linux openshell-gateway release artifacts, image staging artifacts, and local prebuilt gateway artifacts with explicit x86_64-unknown-linux-gnu.2.28 / aarch64-unknown-linux-gnu.2.28 cargo-zigbuild targets.
  • Keep Linux gateway release builds on --features bundled-z3.
  • Add tasks/scripts/setup-zig-cc-wrapper.sh so z3-sys CMake builds can use Zig C/C++ with an explicit glibc 2.28 target.
  • Build Linux openshell-driver-vm release artifacts with explicit glibc 2.28 targets.
  • Verify Linux gateway and VM driver artifacts with tasks/scripts/verify-glibc-symbols.sh 2.28.
  • Key the gateway Rust cache on the Zig wrapper helper and clear only stale z3-sys CMake cache state that points at old cargo-zigbuild wrapper paths.
  • Add LD_BIND_NOW=1 to Linux smoke checks that execute the gateway.
  • Lower the Linux package installer glibc preflight to 2.28 and update installer/support/build documentation.
  • Remove the temporary package-smoke-only manual workflow input after using it to validate the package smoke path on this PR.

Investigation Notes

Previous Gateway Release Artifact

Inspected release v0.0.63, artifact openshell-gateway-x86_64-unknown-linux-gnu.tar.gz.

  • SHA256: cafe1915d15cfdcfcb9b55f1022ae08ce2b46198d208003a525e58c26e66729d
  • Previous highest required symbols were GLIBC_2.29 and GLIBC_2.30.
  • Blocking refs:
    • GLIBC_2.29: log, log2, exp, pow, exp2
    • GLIBC_2.29: posix_spawn_file_actions_addchdir_np
    • GLIBC_2.30: pthread_cond_clockwait
    • GLIBC_2.30: gettid

Branch-Current Gateway glibc 2.28 Build

The current release/image/staging paths use explicit glibc 2.28 targets and bundled Z3:

uv run --with cmake -- mise x -- cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.28 -p openshell-server --bin openshell-gateway --features bundled-z3

The explicit .2.28 target needed one extra piece for bundled Z3: z3-sys builds through CMake with Zig C/C++ as the compiler, and Zig expects the vendorless target form (x86_64-linux-gnu.2.28) rather than Rust's target triple (x86_64-unknown-linux-gnu.2.28).
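
As a rough illustration of the triple mismatch described above, the mapping from Rust's triple to the vendorless Zig form can be sketched in a few lines of shell. The function name rust_to_zig_target is hypothetical and is not taken from tasks/scripts/setup-zig-cc-wrapper.sh; the real helper may normalize targets differently.

```shell
# Hypothetical helper (not from the PR): convert a Rust GNU target triple,
# optionally carrying a glibc suffix such as ".2.28", into the vendorless
# form that Zig accepts by dropping the "unknown" vendor component.
rust_to_zig_target() {
  printf '%s\n' "$1" | sed 's/-unknown-/-/'
}

rust_to_zig_target x86_64-unknown-linux-gnu.2.28    # prints x86_64-linux-gnu.2.28
rust_to_zig_target aarch64-unknown-linux-gnu.2.28   # prints aarch64-linux-gnu.2.28
```

A triple without the "unknown" vendor passes through unchanged, so the sketch is safe to apply to already-normalized targets.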

The wrapper helper fixes that by:

  • Setting CC_*, CXX_*, and CMAKE_TOOLCHAIN_FILE_* environment variables for both suffixed and bare GNU target forms.
  • Stripping caller-provided --target / -target C/C++ flags that cc/CMake may inject.
  • Re-invoking Zig C/C++ with the normalized glibc target.
  • Removing only stale z3-sys CMake build directories that still reference old cargo-zigbuild wrapper paths.
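
The flag-stripping step in the list above might look roughly like the following. This is a simplified sketch, not the contents of setup-zig-cc-wrapper.sh: the function name strip_target_flags is hypothetical, and it prints the surviving arguments one per line, which would not be safe for arguments that contain newlines.

```shell
# Hypothetical sketch: drop caller-provided target flags so a wrapper can
# re-add its own normalized target. Handles "--target=VALUE",
# "--target VALUE", and "-target VALUE".
strip_target_flags() {
  skip_next=0
  for arg in "$@"; do
    if [ "$skip_next" -eq 1 ]; then
      # This argument is the value of a preceding --target / -target.
      skip_next=0
      continue
    fi
    case "$arg" in
      --target=*) ;;                    # joined form: drop it
      --target|-target) skip_next=1 ;;  # separate form: drop flag and value
      *) printf '%s\n' "$arg" ;;        # keep everything else
    esac
  done
}

strip_target_flags -O2 --target=x86_64-unknown-linux-gnu \
  -target aarch64-unknown-linux-gnu -c foo.c
```

With the arguments shown, only -O2, -c, and foo.c remain. A real wrapper would then invoke Zig C/C++ with those arguments plus the normalized glibc target.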

The release verifier remains the guardrail:

tasks/scripts/verify-glibc-symbols.sh 2.28 target/x86_64-unknown-linux-gnu/release/openshell-gateway

Focused symbol results from the compatible build:

  • Old math refs bind to GLIBC_2.2.5: log, log2, exp, pow, exp2.
  • Highest remaining required symbols include fcntl64@GLIBC_2.28 and weak statx@GLIBC_2.28.
  • pthread_cond_clockwait@GLIBC_2.30 is gone. The binary has pthread_cond_timedwait@GLIBC_2.3.2.
  • gettid, posix_spawn_file_actions_addchdir_np, and posix_spawn_file_actions_addchdir are present only as unversioned weak undefined symbols, so they do not raise the GLIBC floor.
  • No GLIBC_2.29 or GLIBC_2.30 versioned references remain.
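
For readers who want to reproduce the "highest required symbol" figure by hand, a minimal sketch follows. It is not the implementation of verify-glibc-symbols.sh; the function name max_glibc_version is hypothetical, and it assumes a sort that supports -V (GNU coreutils). The sample input reuses symbol/version pairs reported above rather than real objdump output.

```shell
# Hypothetical sketch: read symbol-table text on stdin (for example the
# output of `objdump -T <binary>`) and print the highest GLIBC_x.y version
# referenced. Assumes `sort -V` is available.
max_glibc_version() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -u -V | tail -n 1
}

# Illustrative input built from the versions listed in this PR.
printf '%s\n' \
  'fcntl64 GLIBC_2.28' \
  'log GLIBC_2.2.5' \
  'pthread_cond_timedwait GLIBC_2.3.2' | max_glibc_version   # prints 2.28
```

Against a real artifact the same function could be fed with objdump -T target/x86_64-unknown-linux-gnu/release/openshell-gateway. Unversioned weak symbols such as gettid contribute no GLIBC_ token, which matches the observation that they do not raise the floor.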

Weak-Symbol Compatibility Layer

The symbols that caused concern are either gone from the new binary or are guarded weak lookups with older-kernel/libc fallback paths:

  • Rust std's posix_spawn_file_actions_addchdir_np lookup is guarded. If unavailable, get_posix_spawn_addchdir() returns None, posix_spawn() returns Ok(None), and Command falls back to fork/exec. The fork/exec path applies cwd with chdir.
  • Rust std and rustix gettid paths use weak lookup with raw syscall(SYS_gettid) fallback because glibc's wrapper is only available in glibc 2.30+.
  • Linux Rust std condvars use the futex backend in this build. pthread_cond_clockwait is not referenced in the new binary.
  • OpenShell gateway/server-side code does not directly call these weak symbols.

Bundled Z3 and Cache Behavior

We briefly evaluated unbundling Z3, but kept the bundled path. The details are in the issue comment.

Adding the wrapper changes the gateway cache key through the shared-key input, not the key input. A probe showed that setting key: did not affect the primary key that Swatinem/rust-cache computed for this workflow; the working form is:

shared-key: gateway-binary-gnu-${{ matrix.arch }}-zig-wrapper-${{ hashFiles('tasks/scripts/setup-zig-cc-wrapper.sh') }}

Workflow evidence: see the Release Dev cache seed and cache reuse runs listed under Testing.

VM Driver Package Contract

  • Debian packaging stages openshell-driver-vm into the Linux package at /usr/libexec/openshell/openshell-driver-vm.
  • The installation docs say the Debian package installs VM sandbox support.
  • This PR changes .github/workflows/driver-vm-linux.yml so Linux VM driver release artifacts are built with explicit .2.28 GNU Zig targets and verified with verify-glibc-symbols.sh 2.28.
  • This keeps package-managed VM support from silently raising the runtime glibc requirement above the package preflight.

Rocky Linux 8 Gateway Smoke

Ran the branch-current gateway binary on rockylinux:8 with Podman:

/opt/podman/bin/podman run --rm --platform linux/amd64 \
  -v /Users/pmlocek/dev/navigator/target/x86_64-unknown-linux-gnu/release/openshell-gateway:/usr/local/bin/openshell-gateway:ro \
  rockylinux:8 \
  bash -lc 'ldd --version | head -1; LD_BIND_NOW=1 openshell-gateway --version; LD_BIND_NOW=1 openshell-gateway --help >/tmp/gateway-help.txt; head -8 /tmp/gateway-help.txt; echo -- weak-symbol-export-check --; objdump -T /lib64/libc.so.6 | grep -E "(gettid|posix_spawn_file_actions_addchdir_np)" || true'

Observed:

  • ldd (GNU libc) 2.28
  • LD_BIND_NOW=1 openshell-gateway --version printed openshell-gateway 0.0.64-dev.9+g294c64ee.
  • LD_BIND_NOW=1 openshell-gateway --help rendered the help header and commands.
  • Rocky 8 libc did not export gettid or posix_spawn_file_actions_addchdir_np, confirming the gateway starts when those weak symbols are unavailable.
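
The ldd line observed above can be turned into a simple floor check. The sketch below is only an illustration of that comparison, not the installer's actual preflight; both function names are hypothetical, and it assumes a sort that supports -V and that the glibc version is the last field of the first ldd --version line.

```shell
# Hypothetical sketch of a glibc floor check.

# Extract the version (last whitespace-separated field) from a line such as
# "ldd (GNU libc) 2.28" read on stdin.
glibc_version_from_ldd_line() {
  awk '{ print $NF }'
}

# Succeed when version $1 is greater than or equal to version $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

host_version="$(printf '%s\n' 'ldd (GNU libc) 2.28' | glibc_version_from_ldd_line)"
if version_at_least "$host_version" 2.28; then
  echo "glibc $host_version meets the 2.28 floor"
else
  echo "glibc $host_version is older than 2.28"
fi
```

On a live host the sample line would be replaced with ldd --version | head -1, as in the Podman command above.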

Testing

  • mise run pre-commit (passing)
  • bash tasks/scripts/test-install-sh.sh
  • bash -n tasks/scripts/setup-zig-cc-wrapper.sh tasks/scripts/stage-prebuilt-binaries.sh tasks/scripts/verify-glibc-symbols.sh tasks/scripts/docker-build-image.sh tasks/scripts/snap-gateway-wrapper.sh
  • Workflow files and snapcraft.yaml parse as valid YAML with PyYAML.
  • bash tasks/scripts/test-packaging-assets.sh
  • uv run --frozen pytest python/openshell/release_formula_test.py -q
  • git diff --check
  • Local explicit glibc 2.28 gateway build with --features bundled-z3.
  • tasks/scripts/verify-glibc-symbols.sh 2.28 target/x86_64-unknown-linux-gnu/release/openshell-gateway
  • Rocky Linux 8 / glibc 2.28 Podman smoke with LD_BIND_NOW=1 openshell-gateway --version and LD_BIND_NOW=1 openshell-gateway --help.
  • Release Dev package-smoke run with bundled Z3: https://github.com/NVIDIA/OpenShell/actions/runs/27658312503
  • Release Dev cache seed run: https://github.com/NVIDIA/OpenShell/actions/runs/27660740650
  • Release Dev cache reuse run: https://github.com/NVIDIA/OpenShell/actions/runs/27661473950
  • Final full PR CI after removing the temporary package-smoke workflow input.

Checklist

  • PR title follows Conventional Commits.
  • Commits are signed off for DCO compliance.
  • Documentation updated.

Signed-off-by: Piotr Mlocek <pmlocek@nvidia.com>

copy-pr-bot Bot commented Jun 16, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@pimlock pimlock added the test:e2e Requires end-to-end coverage label Jun 16, 2026
@pimlock pimlock changed the title fix(gateway): lower glibc floor to 2.28 fix(linux): lower host glibc floor to 2.28 Jun 16, 2026
Comment thread architecture/build.md Outdated

pimlock commented Jun 16, 2026

Collaborator Author

/ok to test 1b04fd7

pimlock added 4 commits June 16, 2026 17:17
pimlock added 2 commits June 16, 2026 18:16
pimlock commented Jun 17, 2026

Collaborator Author

/ok to test f7a9871

@pimlock pimlock force-pushed the codex/1456-gateway-glibc-228 branch from 54e9ef0 to 09e66a9 Compare June 17, 2026 17:17
@pimlock pimlock requested a review from elezar June 17, 2026 17:20
pimlock commented Jun 17, 2026

Collaborator Author

/ok to test 244a281

@pimlock pimlock marked this pull request as ready for review June 17, 2026 17:20
@pimlock pimlock changed the title fix(linux): lower host glibc floor to 2.28 fix(linux): lower host glibc floor to 2.28 to support RHEL/Rocky 8 Jun 17, 2026
Labels

test:e2e Requires end-to-end coverage

Development

Successfully merging this pull request may close these issues.

feat: expand gateway glibc compatibility

2 participants