chore: Get a VM running by markovejnovic · Pull Request #31 · harmont-dev/hyper

markovejnovic · 2026-06-25T21:53:47Z

No description provided.

…oup v2, suidhelper)

Mix spawns subprocesses in their own session with no controlling terminal (erl_child_setup calls setsid), so the nested `sudo install` could never prompt for a password and always failed with "a terminal is required". Split the task: `cargo xtask stamp` builds + stamps unprivileged, then the privileged copy runs via sudo only when it is already non-interactive (`sudo -n` succeeds); otherwise it prints the exact command to run by hand.

The unprivileged node used to pass `--bin <path>` to the setuid helper for every losetup/dmsetup/blockdev op. Letting the caller name the binary the helper escalates to run is a needless trust hole, even with SafeBin checks. Move the paths into the helper-owned config (/etc/hyper/config.toml) with sane defaults (/usr/sbin/{losetup,dmsetup,blockdev}); the helper validates each as a SafeBin (absolute, root-owned, non-writable, exact basename) at dispatch, as the real uid, before acquiring root. The `--bin` argument is gone from both sides. Also: - Add a `dmsetup targets` op so the dm-target readiness probe runs through the helper (it opens /dev/mapper/control, which needs root) instead of shelling dmsetup directly as the BEAM user. - An absent config file now falls back to the built-in defaults (trusted, compiled into the root-owned binary); a present-but-untrusted file stays fatal. Drop the Elixir-side *_path config and per-tool presence checks. - Add :mix to the dialyzer PLT so the Mix tasks resolve Mix.raise/shell.

Document the Docker Postgres setup matching config.exs defaults, the now config-sourced (and optional) device-binary paths, and the sudo/tty caveat for mix suidhelper.install.

Spell out modprobe + dmsetup targets verify, the linux-modules-extra fallback for stripped cloud kernels, and persisting via modules-load.d - the missing_dm_targets boot failure points here.

mkdir under the cgroup-v2 hierarchy, enable cpu/memory in subtree_control (root-level fallback), and persist via systemd-tmpfiles since cgroupfs is memory-backed. The :missing_parent_cgroup boot failure points here.

Budget.Advertiser advertises on init, which calls Hyper.Node.Layer.active/0 and selects on Hyper.Node.Layer.Registry. That registry is owned by Hyper.Node.Layer, which was ordered after Budget.Supervisor - so boot crashed with `unknown registry: Hyper.Node.Layer.Registry`. Order Layer first.

ThinPool traps exits (for terminate/2 device teardown), so the normal close of each System.cmd port arrived as an unhandled {:EXIT, port, :normal} and logged as an error. Add a handle_info clause to ignore normal exits. Tracing defaulted to Honeycombs endpoint but only sent the auth header when HONEYCOMB_API_KEY was set, so keyless runs 401d on every batch. Export only when a key or a custom OTEL endpoint is configured; otherwise disable the exporter.

An unclean shutdown (SIGKILL, so terminate/2 never ran) leaves hyper-thinpool behind; the next boot failed with "device-mapper: create ioctl ... Device or resource busy". Best-effort remove any pre-existing pool before rebuilding, ahead of zero_metadata so a still-live pool is never corrupted.

Hyper.Img.OciLoader.test_system/0 (checks skopeo/umoci/mke2fs) existed but was never called from Hyper.Node.test_system, so a missing skopeo only surfaced at image-load time. Wire it in after Umoci.ensure_installed so the node refuses to boot when an OCI loader tool is absent, like every other host requirement.

codecov · 2026-06-25T21:54:47Z

Codecov Report

❌ Patch coverage is 37.34291% with 349 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
native/suidhelper/src/tools/jailer.rs	42.85%	92 Missing ⚠️
lib/mix/tasks/firecracker.install.ex	0.00%	34 Missing ⚠️
lib/hyper/node/fire_vmm/state.ex	0.00%	28 Missing ⚠️
lib/hyper/node/reclaim.ex	0.00%	27 Missing ⚠️
native/suidhelper/src/config.rs	73.61%	19 Missing ⚠️
lib/mix/tasks/suidhelper.install.ex	0.00%	14 Missing ⚠️
lib/hyper/node.ex	0.00%	13 Missing ⚠️
...tive/suidhelper/src/tools/chroot_jail/grant_api.rs	74.46%	12 Missing ⚠️
native/suidhelper/src/tools/dmsetup/mod.rs	0.00%	12 Missing ⚠️
native/suidhelper/src/tools/losetup.rs	0.00%	12 Missing ⚠️
... and 21 more

📢 Thoughts on this report? Let us know!

Layer.Server, Img.Mutable and Img.Server all trap exits (for terminate/2 device teardown) and shell privileged commands through System.cmd, which links a transient port to the process. The port's normal close arrived as {:EXIT, port, :normal} with no matching handle_info clause - Layer.Server crash-looped on it while mounting a layer, which surfaced to create_vm as :no_capacity. Add the same ignore-clause already in ThinPool to each.

github-actions · 2026-06-25T22:00:59Z

Test Results

305 tests +55 304 ✅ +55 5s ⏱️ ±0s
55 suites + 6 1 💤 ± 0
2 files ± 0 0 ❌ ± 0

Results for commit f2dc209. ± Comparison against base commit 3603c45.

This pull request removes 1 and adds 56 tests. Note that renamed tests count towards both.

hyper-suidhelper::e2e_config ‑ missing_config_exits_2

Elixir.Hyper.Node.FireVMM.JailerTest ‑ property gen_vm_id/0 produces only alphanumeric ids
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test args contain --id, --uid, --gid with the opts values
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test args do not contain privileged flags owned by the suidhelper
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test args end with --api-sock /api.socket
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test args include --cgroup cpu.max and memory.max for :micro type
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test args start with the jailer subcommand
Elixir.Hyper.Node.FireVMM.JailerTest ‑ test binary is the suid helper
Elixir.Hyper.SuidHelper.DmsetupPropertiesTest ‑ property parse_names recovers the device name from each `dmsetup ls` row
Elixir.Hyper.SuidHelper.DmsetupPropertiesTest ‑ test parse_names reads the empty-table sentinel as no devices
Elixir.Hyper.SuidHelper.LosetupTest ‑ test an empty listing yields no pairs
…

♻️ This comment has been updated with latest results.

@callback

DynamicSupervisor.start_child rejected the FireVMM child spec with {:invalid_child_spec, ...} because the map used a :vm_id key where a supervisor child spec requires :id - so no VM could ever boot. Use :id. Add @SPEC child_spec(Opts.t()) :: Supervisor.child_spec(). child_spec/1 is a plain def (not a typed @callback), so without a spec dialyzer never compared the returned map against the child-spec contract and the typo passed the gate. With the spec, the bad key fails dialyzer as invalid_contract.

place/3 reduced every candidate failure into a blanket {:error, :no_capacity}, so a real boot error on the only candidate was indistinguishable from genuine lack of capacity. Log the actual refusal reason at each candidate.

An unclean shutdown (SIGKILL / :erlang.halt, where terminate/2 never runs) leaves hyper dm devices and loop devices behind. The next boot then crashed in ThinPool.init with "device-mapper: create ioctl on hyper-thinpool: Device or resource busy", and a leaked thin volume held the pool open so the previous remove-the-pool-only reclaim could not clear it. Add read-only `dmsetup ls` and `losetup --list` ops to the suidhelper, and a Hyper.Node.Reclaim pass that runs once before any device GenServer starts: it removes every hyper-prefixed dm device leaf-first (retrying leftovers until a pass clears nothing new, so stacked snapshots and the pool-under-volume case resolve) then detaches loop devices backing files under the data dirs. Replaces ThinPool.init's pool-only reclaim. Best-effort; logs and continues.

Extends the suidhelper config with four new fields needed before the jailer exec path can be wired up: - firecracker / jailer: Option<PathBuf> — no built-in default; accessors return BinError::Unconfigured when absent so callers get a clear "operator must configure this" signal rather than a path-validation error. - parent_cgroup: String — defaults to "hyper", matching Elixir's @parent_cgroup (operators narrowing must keep both in sync). - uid_gid_range: Option<UidGidRange> — total accessor returning (900_000, 999_999) when absent; a present range with min==0 or min>max is fatal at safe_load time, consistent with the "present-but-untrusted is fatal" trust model. Adds BinError (Unconfigured | Bin) so firecracker/jailer accessors have a richer error type than the existing tool accessors (which can never be Unconfigured). Adds validate_uid_gid_range as a public pure function so the refusal contract can be property-tested without touching the file system. New test target config_uid_gid_range exercises: absent → default, valid range round-trips via TOML, min==0 always rejected, min>max always rejected. Extends e2e/config with: Unconfigured on absent keys, basename mismatch rejected, Ok on root-owned correct-name binaries (root-guarded), bad uid_gid_range binary exit 2 (root-guarded).

…ec failure

The BEAM no longer names a privileged binary path. `Jailer.command/1` now launches `hyper-suidhelper jailer` with only the id, uid/gid, cgroup flags, and api-sock; the helper reads firecracker/jailer binary paths, chroot base, parent cgroup, and cgroup version from its trusted /etc/hyper/config.toml, re-acquires root, and execve's the jailer (same pid, so MuonTrap owns the lifetime). Changes: - config.ex: add config_toml/0 (file → persistent_term cache); rewrite work_dir/0 over it; add firecracker_bin/0 + jailer_bin/0 via fetch_bin!/1 (raises when key absent); remove firecracker_install_dir/0. - provider.ex: deleted. - jailer.ex: rewire command/1 and exec_name/0; drop Provider alias. - daemon.ex: note that supervised process is the helper (execve → jailer). - node.ex: replace Provider.ensure_installed() with check_firecracker_bins/0. - hyper.ex: prefix gen_vm_id/0 with "v" so ids never start with "-". - jailer_test.exs: new pure test suite; load-bearing assertion that args contain no --exec-file / --chroot-base-dir / -- flags.

…r/jailer Downloads the pinned Firecracker v1.16.0 release via Redist.Targz.install/3 (download → SHA-256 verify → extract), copies the version-stamped binaries to bare-basename paths (<prefix>/firecracker and <prefix>/jailer) required by the suidhelper's SafeBin validator, and prints the config snippets the operator needs to paste.

…nstall Remove firecracker/jailer from the auto-redistributed list; operators now install them via `mix firecracker.install [--prefix <dir>]`. Update /etc/hyper/config.toml example to show the required firecracker/jailer keys (no default, root-owned + non-world-writable, bare basenames validated by the helper) and the optional [uid_gid_range] table. Update `config :hyper` snippet to drop the auto-download TODOs and point at the install-produced paths. Document that uid_gid_range in config :hyper and [uid_gid_range] in config.toml must be kept in sync.

The Elixir node was fetching TOML keys "firecracker_bin"/"jailer_bin" while the setuid helper (and the TOML example in docs) uses "firecracker"/"jailer". A correctly-installed host therefore crashed on launch with an unset-key error even when the file was present. - config.ex: fix fetch_bin! keys to "firecracker"/"jailer"; add non-raising firecracker_bin_configured/0 and jailer_bin_configured/0 returning {:ok, path} | :error for use by pre-launch checks. - node.ex: rewrite check_firecracker_bins/0 to use the non-raising accessors so a missing key returns {:error, :firecracker_not_configured} instead of raising, honouring the @SPEC contract. - firecracker.install.ex: drop the dead "config :hyper, firecracker_bin:" snippet from print_config/1 — nothing reads Application env for these paths. - docs/cookbook/intro.md: remove firecracker_bin/jailer_bin from config :hyper block; clarify paths live only in /etc/hyper/config.toml. - jailer_test.exs: align stub map keys to "firecracker"/"jailer" (the bug that masked this); add positive --cgroup assertion for :micro type. - Cargo.toml: remove redundant toml dev-dependency (already a normal dep).

The task runs unprivileged, so the binaries land owned by the invoking user. SafeBin in the suidhelper refuses any jailer/firecracker not owned by root, so every launch fails closed until they are chowned. Print the exact chown/chmod commands (with the real installed paths) instead of a vague 'ensure root-owned' note.

The supervised process is MuonTrap.Daemon, so route the jailed process's stdout+stderr (guest serial console included) to the Logger via log_output, and map the exit status into {:firecracker_exited, status} so a crash names the real exit code instead of the opaque :error_exit_status. Add explicit launch/launch-failure log lines keyed by vm id, so a boot loop is visible without reading console scrollback.

The :awaiting_api probe swallowed the describe_instance error and stopped with a bare :daemon_unready, hiding why a healthy-looking firecracker is unreachable. Log the last probe error on deadline and carry it in the stop reason ({:daemon_unready, reason}), so a host->jail socket permission/path problem is diagnosable instead of looking like an unexplained 5s restart loop.

… the node user The jailer drops firecracker to a per-VM uid/gid and chroots it, so the API socket it creates at <jail>/root/api.socket is owned by that per-VM id, mode 0755. Connecting a unix socket needs write permission, so the unprivileged node controller gets EACCES on connect(). grant-api confines the socket path under JAIL_BASE (SafePath + O_NOFOLLOW walk, fd-relative on the pinned root dir), verifies the leaf is a real socket via fstatat(AT_SYMLINK_NOFOLLOW) (a planted file/symlink is refused, never touched), then chowns it to the helper's caller (getuid/getgid, the real=caller ids inside the privileged scope) and chmods 0660. A not-yet-created socket is reported Pending, not an error. Adds SafeDir::stat and SafeDir::chmod fd-relative primitives.

… can connect AwaitingApi now hands the jailed API socket to the node user (via the new chroot-jail grant-api op) before each readiness probe. firecracker creates the socket owned by the per-VM uid, so the controller gets EACCES until the helper chowns it; granting first lets the probe (and every later API call) connect. The grant runs once (tracked by State.api_granted); :socket_pending and transient grant errors keep the controller waiting until the existing boot deadline rather than crashing, via a shared deadline-aware keep_probing path.

gen_vm_id used Base.url_encode64, which emits - and _. firecracker rejects _ in an instance id (InvalidInstanceId / "Invalid char (_)"), so any id containing one crash-looped the jailer at boot. Switch to lowercase base32 ([a-z2-7], alphanumeric only) - the intersection of the firecracker, dm/jailer, and registry-key constraints. Strengthen the property from "no leading -" to "strictly alphanumeric".

Starting a process with a {:via, Horde.Registry, _} name makes OTP run gen:get_proc_name right after start: it calls whereis_name immediately after the synchronous register. Horde materialises the name into local ETS only asynchronously (DeltaCRDT diff loop), so under registry churn the read loses the race and startup aborts with {:process_not_registered_via, Horde.Registry}. The crash-storm from the bad vm_id flooded the CRDT and tipped this over, killing FireVMM.State. Add Routing.register_self/1 and have the supervisor, client, and state machine register from their own init (started unnamed). Names stay cluster-resolvable via via/1 once the diff propagates - callers already tolerate that lag. Core starts unnamed entirely (resolved by nobody).

grant-api chowned the API socket to the caller, but the jailer leaves <id>/root as 0700 owned by the per-VM uid. connect() needs search (+x) on every ancestor, so the unprivileged node still got EACCES traversing into root - the socket's owner was irrelevant. The op now also opens that one directory to the caller's group: owner stays the per-VM uid (firecracker needs it), chgrp to the caller's gid, chmod 0710 (owner rwx, group --x traverse-not-list, other none). Unrelated users stay locked out; only the socket and its parent's group/mode move. Add SafeDir::chmod_self/chgrp_self (fchmod/fchown on the pinned root fd, not by name - TOCTOU-safe). Extend the grant test to assert root is chgrp'd to the caller and chmod'd 0710.

The rootfs device handed to chroot-jail staging is /dev/mapper/hyper-rw-<id>, a symlink into the dm farm. SafeFile<IsBlockDevice> opens it O_NOFOLLOW (it must never follow an attacker-supplied symlink), so the dm symlink itself was rejected with "file is not of the required type" and the guest never booted. Resolve the device to its real /dev/dm-N node via canonicalize before the open. Safe because the BlockDev lexical guard already proved the name is a hyper-owned dm device and /dev/mapper is a root-owned 0755 dir an unprivileged caller cannot redirect; loop nodes canonicalize to themselves. The re-opened target is still verified IsBlockDevice.

Both staging and config-apply failures end the boot via a :one_for_all supervisor restart, so without an explicit log the reason vanished into the restart cycle and the VM just appeared to relaunch for no visible cause. Log the reason at each failure path before stopping.

markovejnovic added 12 commits June 25, 2026 20:09

feat(suidhelper): mix suidhelper.install wrapping cargo xtask install

d5bffcc

docs: document full node requirements (postgres, dm targets, kvm, cgr…

9201730

…oup v2, suidhelper)

docs: replace em-dashes with ASCII hyphens in intro.md

b68dd5e

docs: Postgres quickstart, optional helper config, install caveats

f340571

Document the Docker Postgres setup matching config.exs defaults, the now config-sourced (and optional) device-binary paths, and the sudo/tty caveat for mix suidhelper.install.

docs: how to load device-mapper modules for the dm targets

23a5883

Spell out modprobe + dmsetup targets verify, the linux-modules-extra fallback for stripped cloud kernels, and persisting via modules-load.d - the missing_dm_targets boot failure points here.

docs: how to create the parent cgroup + delegate controllers

b0c0b5f

mkdir under the cgroup-v2 hierarchy, enable cpu/memory in subtree_control (root-level fallback), and persist via systemd-tmpfiles since cgroupfs is memory-backed. The :missing_parent_cgroup boot failure points here.

markovejnovic added 2 commits June 25, 2026 21:55

deslop

06cc05b

markovejnovic added 14 commits June 25, 2026 22:04

fix(scheduler): log why a candidate refused placement

2ac546a

place/3 reduced every candidate failure into a blanket {:error, :no_capacity}, so a real boot error on the only candidate was indistinguishable from genuine lack of capacity. Log the actual refusal reason at each candidate.

feat(suidhelper): add jailer subcommand that execs the jailer as root

8f671a3

fix(suidhelper): fail closed on close_range error; _exit on jailer ex…

a5e098c

…ec failure

markovejnovic added 4 commits June 26, 2026 01:51

markovejnovic force-pushed the chore/get-a-vm-running branch from c9ebaaf to 4196153 Compare June 26, 2026 02:45

markovejnovic added 2 commits June 26, 2026 05:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: Get a VM running#31

chore: Get a VM running#31
markovejnovic wants to merge 34 commits into
mainfrom
chore/get-a-vm-running

markovejnovic commented Jun 25, 2026

Uh oh!

codecov Bot commented Jun 25, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 25, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

markovejnovic commented Jun 25, 2026

Uh oh!

codecov Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented Jun 25, 2026 •

edited

Loading

github-actions Bot commented Jun 25, 2026 •

edited

Loading