Draft
34 commits
d5bffcc
feat(suidhelper): mix suidhelper.install wrapping cargo xtask install
markovejnovic Jun 25, 2026
9201730
docs: document full node requirements (postgres, dm targets, kvm, cgr…
markovejnovic Jun 25, 2026
b68dd5e
docs: replace em-dashes with ASCII hyphens in intro.md
markovejnovic Jun 25, 2026
e2253f6
fix(suidhelper): make mix suidhelper.install tty-safe
markovejnovic Jun 25, 2026
4ddcd4d
feat(suidhelper): source device binaries from config, drop caller --bin
markovejnovic Jun 25, 2026
f340571
docs: Postgres quickstart, optional helper config, install caveats
markovejnovic Jun 25, 2026
23a5883
docs: how to load device-mapper modules for the dm targets
markovejnovic Jun 25, 2026
b0c0b5f
docs: how to create the parent cgroup + delegate controllers
markovejnovic Jun 25, 2026
5761a4e
fix(node): start Layer before Budget.Supervisor
markovejnovic Jun 25, 2026
8dc61d3
fix: quiet benign ThinPool port exits and keyless OTLP export
markovejnovic Jun 25, 2026
1320700
fix(thin_pool): reclaim a stale dm pool on init
markovejnovic Jun 25, 2026
6ef56d2
fix(node): validate OCI loader tools at boot
markovejnovic Jun 25, 2026
06cc05b
deslop
markovejnovic Jun 25, 2026
c0685ae
fix: ignore benign port exits in the remaining trap_exit servers
markovejnovic Jun 25, 2026
7fa0480
fix(fire_vmm): child_spec key must be :id, not :vm_id
markovejnovic Jun 25, 2026
2ac546a
fix(scheduler): log why a candidate refused placement
markovejnovic Jun 25, 2026
68c8d4b
feat(node): reclaim orphaned dm/loop devices at boot
markovejnovic Jun 25, 2026
b6ce604
feat(suidhelper): add firecracker/jailer/uid_gid_range to config
markovejnovic Jun 25, 2026
8f671a3
feat(suidhelper): add jailer subcommand that execs the jailer as root
markovejnovic Jun 25, 2026
a5e098c
fix(suidhelper): fail closed on close_range error; _exit on jailer ex…
markovejnovic Jun 25, 2026
5f9f36b
refactor(fire_vmm): drive jailer through suidhelper; drop Provider
markovejnovic Jun 25, 2026
3a58b77
feat(mix): add firecracker.install task to fetch+configure firecracke…
markovejnovic Jun 25, 2026
a8743fc
docs: firecracker/jailer are operator-installed via mix firecracker.i…
markovejnovic Jun 26, 2026
5770f57
fix(config): node and helper read the same firecracker/jailer TOML keys
markovejnovic Jun 26, 2026
ffd46a4
feat(mix): firecracker.install prints the chown/chmod root commands
markovejnovic Jun 26, 2026
202141b
feat(fire_vmm): log jailer/firecracker output and real exit status
markovejnovic Jun 26, 2026
0ecccf5
feat(fire_vmm): surface the API readiness-probe failure reason
markovejnovic Jun 26, 2026
2d48872
feat(suidhelper): add chroot-jail grant-api to hand the API socket to…
markovejnovic Jun 26, 2026
85bc4a4
feat(fire_vmm): grant the API socket before probing so the controller…
markovejnovic Jun 26, 2026
71d6440
fix(hyper): generate alphanumeric vm ids (firecracker rejects _)
markovejnovic Jun 26, 2026
b3c2315
fix(fire_vmm): self-register per-VM names in init, not via a start name
markovejnovic Jun 26, 2026
4196153
fix(suidhelper): grant-api opens the jail root dir for node traversal
markovejnovic Jun 26, 2026
5284878
fix(suidhelper): resolve dm symlink to real node before rootfs open
markovejnovic Jun 26, 2026
f2dc209
feat(fire_vmm): log boot failures in the Configuring state
markovejnovic Jun 26, 2026
32 changes: 21 additions & 11 deletions config/runtime.exs
@@ -14,17 +14,27 @@ config :hyper, Hyper.Node.Config.Budget,
# Where to send traces. Defaults to Honeycomb; override OTEL_EXPORTER_OTLP_*
# to point at any OTLP/HTTP backend (Collector, Grafana, etc).
if config_env() != :test do
endpoint = System.get_env("OTEL_EXPORTER_OTLP_ENDPOINT", "https://api.honeycomb.io")
custom_endpoint = System.get_env("OTEL_EXPORTER_OTLP_ENDPOINT")
api_key = System.get_env("HONEYCOMB_API_KEY")

headers =
case System.get_env("HONEYCOMB_API_KEY") do
nil -> []
"" -> []
key -> [{"x-honeycomb-team", key}]
end
cond do
api_key not in [nil, ""] ->
config :opentelemetry_exporter,
otlp_protocol: :http_protobuf,
otlp_endpoint: custom_endpoint || "https://api.honeycomb.io",
otlp_headers: [{"x-honeycomb-team", api_key}]

config :opentelemetry_exporter,
otlp_protocol: :http_protobuf,
otlp_endpoint: endpoint,
otlp_headers: headers
custom_endpoint not in [nil, ""] ->
# A custom OTLP backend (e.g. a local Collector) needs no Honeycomb key.
config :opentelemetry_exporter,
otlp_protocol: :http_protobuf,
otlp_endpoint: custom_endpoint,
otlp_headers: []

true ->
# No backend configured: exporting to the Honeycomb default with no key
# 401s on every batch. Stay silent instead (typical for local dev). Set
# HONEYCOMB_API_KEY or OTEL_EXPORTER_OTLP_ENDPOINT to enable tracing.
config :opentelemetry, traces_exporter: :none
end
end
196 changes: 185 additions & 11 deletions docs/cookbook/intro.md
@@ -19,13 +19,187 @@ The absolute best way to get started with `Hyper` is to play with it.

### Requirements

Hyper requires the following software be installed on each node running it:
#### External services

- [`skopeo`](https://github.com/containers/skopeo)
- [`e2fsprogs`](https://github.com/tytso/e2fsprogs)
Hyper needs a **PostgreSQL** server reachable from every node - it is the image
database and the only stateful external dependency.

Hyper has more runtime dependencies, but they are automatically redistributed
by Hyper.
For local development the quickest path is Docker. The connection details below
match the defaults in `config/config.exs` (`Hyper.Img.Db.Repo`):

```sh
docker run -d --name hyper-pg \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=hyper_dev \
-p 5432:5432 \
postgres:16
```

Once it is up, create and migrate the schema (the repo is not in `ecto_repos`,
so pass it with `-r`):

```sh
mix ecto.create -r Hyper.Img.Db.Repo
mix ecto.migrate -r Hyper.Img.Db.Repo
```

The container has no restart policy, so it stays stopped after a host reboot;
`docker start hyper-pg` brings it back. To point Hyper at an existing server
instead, override the `Hyper.Img.Db.Repo` block in your `config.exs`.
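
Right after `docker run`, Postgres can take a few seconds to accept
connections, so running `mix ecto.create` immediately may fail. The helper
below is a minimal sketch of a retry loop; `wait_for` is a name made up for
this example, not something Hyper ships.

```sh
# Hypothetical helper, not part of Hyper.
# wait_for ATTEMPTS CMD... -> run CMD once per second until it succeeds or
# ATTEMPTS runs out. Exit 0 on success, 1 if every attempt failed.
wait_for() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    "$@" >/dev/null 2>&1 && return 0
    attempts=$((attempts - 1))
    # Only sleep if another attempt is still coming.
    [ "$attempts" -gt 0 ] && sleep 1
  done
  return 1
}
```

One way to use it, assuming the `hyper-pg` container from above, is
`wait_for 30 docker exec hyper-pg pg_isready -U postgres` before the two
`mix ecto.*` commands.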

#### System binaries

These are used by the unprivileged node directly; each must be on the node's
`PATH` (the parenthesized override is the `config :hyper` key you can set if
the binary lives elsewhere):

- [`skopeo`](https://github.com/containers/skopeo) - pulls OCI images
(`skopeo_path`)
- [`e2fsprogs`](https://github.com/tytso/e2fsprogs) - provides `mke2fs`, which
builds the ext4 rootfs (`mke2fs_path`)
- `du`, `getent` (from **coreutils** and **glibc**) - rootfs sizing and user
resolution. Present on essentially every distro.

The privileged device binaries - `losetup`, `blockdev` (from **util-linux**)
and `dmsetup` (from **lvm2** / device-mapper) - are run only by the setuid
helper, never named by the unprivileged caller. Their paths therefore live in
the helper's own config, `/etc/hyper/config.toml`, and default to
`/usr/sbin/{losetup,blockdev,dmsetup}`.

**The config file is required to run VMs**: `firecracker` and `jailer` have no
built-in defaults. The device-tool paths (`dmsetup`, `losetup`, `blockdev`)
and `work_dir` do have built-in defaults, so if those defaults suffice and you
are not running VMs you may omit the file entirely. When the file is present
it must be root-owned and not group/other-writable, or the helper refuses to
start (a present-but-untrusted file is treated as an attack signal, unlike a
missing one):

```toml
# /etc/hyper/config.toml (root-owned, mode 0644)
work_dir = "/srv/hyper"

# REQUIRED - no default. Each must be an absolute path to a root-owned,
# non-group/world-writable binary named exactly "firecracker" or "jailer"
# (the helper validates the basename). Run `mix firecracker.install` to
# download the pinned release and print these values.
firecracker = "/opt/firecracker/firecracker"
jailer = "/opt/firecracker/jailer"

# Optional device-tool overrides; default to /usr/sbin/{dmsetup,losetup,blockdev}.
# Each must be root-owned and not group/world-writable.
dmsetup = "/usr/sbin/dmsetup"
losetup = "/usr/sbin/losetup"
blockdev = "/usr/sbin/blockdev"

# Optional. Governs which uid/gid values the helper accepts when launching the
# jailer. Must satisfy min > 0 and min <= max. Defaults to {900000, 999999}.
# If you narrow this range, set the same bounds in `config :hyper, uid_gid_range:`
# so the node hands out only uids the helper will accept.
[uid_gid_range]
min = 900000
max = 999999
```

`dmsetup` (lvm2) is frequently *not* installed by default - check that one
first.
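
The trust rule for `/etc/hyper/config.toml` described above (root-owned, not
group/other-writable) can be checked by hand before starting the helper. This
is a minimal sketch that mirrors the documented rule, not the helper's actual
validation code; the function names are made up for illustration and
`check_config` assumes GNU `stat`.

```sh
# Sketch only: mirrors the documented rule, not the helper's real code.
# is_trusted_mode UID OCTAL_MODE -> exit 0 if the owner is root (uid 0) and
# the group-write (020) and other-write (002) bits are both clear.
is_trusted_mode() {
  uid=$1
  mode=$2
  [ "$uid" -eq 0 ] || return 1
  # The leading 0 makes the shell read the mode as octal before masking.
  [ $(( 0$mode & 022 )) -eq 0 ]
}

# check_config PATH -> apply the rule to a real file (GNU stat assumed).
check_config() {
  # %u = owner uid, %a = permission bits in octal, e.g. "0 644"
  info=$(stat -c '%u %a' "$1") || return 1
  is_trusted_mode "${info% *}" "${info#* }"
}
```

For example, `check_config /etc/hyper/config.toml || echo "untrusted config"`
flags a file the helper would likely refuse.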

#### Kernel features

The host kernel must provide:

- **KVM** - `/dev/kvm` must exist and be accessible to the per-VM users (see
the `uid_gid_range` configuration).
- **cgroup v2** - the unified hierarchy mounted at `/sys/fs/cgroup`. v1-only
hosts are not supported.
- **device-mapper targets** `snapshot`, `thin`, and `thin-pool` - from the
`dm_snapshot` (provides `snapshot`) and `dm_thin_pool` (provides `thin` and
`thin-pool`) modules. Hyper refuses to start without all three; on boot it
fails with `{:missing_dm_targets, [...]}` listing whichever are absent.
- **loop devices** - the `loop` module, used to attach layer images as block
devices.

Load the modules and confirm the targets are present:

```sh
sudo modprobe -a dm_snapshot dm_thin_pool loop # -a: load every named module
sudo dmsetup targets # must list snapshot, thin, and thin-pool
```

If `modprobe` reports the module is missing, the running kernel lacks it -
minimal cloud images often strip device-mapper. On Debian/Ubuntu, install the
extra modules for the running kernel, then load them:

```sh
sudo apt-get install -y "linux-modules-extra-$(uname -r)"
sudo modprobe -a dm_snapshot dm_thin_pool loop
```

Make the modules load on every boot:

```sh
printf 'dm_snapshot\ndm_thin_pool\nloop\n' | sudo tee /etc/modules-load.d/hyper.conf
```
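
The target check above can be scripted so a provisioning step fails early
instead of waiting for `{:missing_dm_targets, [...]}` at boot. This is a
sketch, not Hyper's own check; `missing_dm_targets` is a hypothetical name,
and it assumes the target name is the first column of `dmsetup targets`
output.

```sh
# Hypothetical helper, not part of Hyper.
# Reads `dmsetup targets` output on stdin and prints each required target
# that is absent, one per line. Exit 0 means nothing is missing.
missing_dm_targets() {
  present=$(awk '{print $1}')
  missing=0
  for target in snapshot thin thin-pool; do
    # -x matches whole lines only, so "thin" does not match "thin-pool".
    if ! printf '%s\n' "$present" | grep -qxF -- "$target"; then
      printf '%s\n' "$target"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `sudo dmsetup targets | missing_dm_targets` prints nothing and exits 0
when all three targets are available.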

#### Privileged setup

- The **setuid-root device helper** (`hyper-suidhelper`) must be installed.
Run `mix suidhelper.install`, which builds, stamps, and places it
setuid-root on `PATH`. Every privileged operation (losetup, dmsetup, mknod,
chroot jails) routes through it; the BEAM itself runs unprivileged.

The final `sudo install` step runs without a controlling terminal (Mix
captures the nested `cargo` output), so on a typical `tty_tickets` sudo
setup it cannot prompt for a password. If it fails, the build has already
stamped the binary - just run the copy yourself:

```sh
sudo install -o root -g root -m 4755 \
native/suidhelper/target/release/hyper-suidhelper \
/usr/local/bin/hyper-suidhelper
```
- A **parent cgroup** named by `cgroup_parent` (default `hyper`) must exist
under the cgroup-v2 hierarchy; Hyper creates each VM's cgroup beneath it and
fails to boot with `:missing_parent_cgroup` if it is absent. Create it and
delegate the `cpu` and `memory` controllers so the per-VM cgroups can set
`cpu.max` / `memory.max`:

```sh
sudo mkdir -p /sys/fs/cgroup/hyper
echo '+cpu +memory' | sudo tee /sys/fs/cgroup/hyper/cgroup.subtree_control
```

If that last write errors, the root hierarchy is not delegating those
controllers down yet - enable them there first, then retry the line above:

```sh
echo '+cpu +memory' | sudo tee /sys/fs/cgroup/cgroup.subtree_control
```

The cgroup hierarchy is memory-backed, so `/sys/fs/cgroup/hyper` does **not**
survive a reboot. Re-create it each boot, or persist it with
`systemd-tmpfiles`:

```sh
echo 'd /sys/fs/cgroup/hyper 0755 root root -' \
| sudo tee /etc/tmpfiles.d/hyper-cgroup.conf
```
- The host UID/GID range must be free for Hyper to allocate per-VM users
from. The node's range is set by `uid_gid_range` in `config :hyper`; the
helper independently reads `[uid_gid_range]` from `/etc/hyper/config.toml`
(see the example above) and only accepts jailer `--uid`/`--gid` values within
that range. Keep the two in sync.
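
To confirm the parent cgroup is delegating what the per-VM cgroups need, you
can inspect its `cgroup.subtree_control` file. The function below is a small
sketch under that assumption; `controllers_delegated` is a made-up name, and
this is not the check Hyper runs at boot.

```sh
# Hypothetical helper, not part of Hyper.
# controllers_delegated FILE -> exit 0 if both cpu and memory are listed in
# the given cgroup.subtree_control file (one space-separated line).
controllers_delegated() {
  [ -r "$1" ] || return 1
  for ctrl in cpu memory; do
    # One controller per line, then exact match: "cpu" must not match "cpuset".
    tr -s ' \t' '\n\n' < "$1" | grep -qxF -- "$ctrl" || return 1
  done
}
```

For example:
`controllers_delegated /sys/fs/cgroup/hyper/cgroup.subtree_control || echo "delegate +cpu +memory first"`.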

#### Auto-redistributed

`umoci` and the guest `vmlinux` kernels are downloaded, checksum-verified, and
managed by Hyper itself; you do not install them.

`firecracker` and `jailer` are not auto-downloaded. Install them with
`mix firecracker.install [--prefix <dir>]` (default prefix `/opt/firecracker`),
which downloads the pinned v1.16.0 release, places the binaries at
`<prefix>/firecracker` and `<prefix>/jailer`, and prints the `/etc/hyper/config.toml`
snippet to paste in.

### Installation

@@ -40,22 +214,22 @@ configuration.

```elixir
config :hyper,
# TODO(markovejnovic): Remove this after it gets auto-downloaded.
jailer_bin: "/opt/firecracker/jailer-v1.16.0-x86_64",
# TODO(markovejnovic): Remove this after it gets auto-downloaded.
firecracker_bin: "/opt/firecracker/firecracker-v1.16.0-x86_64",
# You must create a parent cgroup on your system. Continue reading for
# further details.
cgroup_parent: "hyper",
# TODO(markovejnovic): Merge these directories into one.
jailer_chroot_base: "/srv/hyper/jails",
socket_dir: "/srv/hyper/socks",
scratch_dir: "/srv/hyper/scratch",
# Hyper requires that each VM you pass
# Must match the [uid_gid_range] table in /etc/hyper/config.toml so the node
# hands out only uids the helper will accept.
uid_gid_range: {900_000, 999_999},
layer_dir: "/srv/hyper/layers"
```

The `firecracker` and `jailer` binary paths are **not** set here - they are read
from `/etc/hyper/config.toml` (the single source of truth shared with the setuid
helper). See the `config.toml` example above.

<!-- TODO(markovejnovic): Update the config section. -->

### Usage
19 changes: 17 additions & 2 deletions lib/hyper.ex
@@ -34,9 +34,24 @@ defmodule Hyper do
end
end

@doc "Generate a fresh VM id (url-safe base64, dm-name compatible)."
@doc """
Generate a fresh VM id: a `v` prefix followed by lowercase base32 of 10 random
bytes, charset `[a-z2-7]`.

Alphanumeric only - no `-`, `_`, or other punctuation. That is the intersection
of three independent constraints the id must satisfy at once:

* firecracker rejects `_` in an instance id (`InvalidInstanceId`);
* dm/jailer names must not start with `-`;
* registry keys and chroot path components stay trivially safe.

The previous base64url encoding emitted `-` and `_`, so it could produce ids
firecracker refused at boot (`Invalid char (_)`).
"""
@spec gen_vm_id() :: Hyper.Vm.id()
def gen_vm_id, do: Base.url_encode64(:crypto.strong_rand_bytes(9), padding: false)
def gen_vm_id do
"v" <> Base.encode32(:crypto.strong_rand_bytes(10), padding: false, case: :lower)
end

@spec resolve_arch(Hyper.Vm.Instance.arch() | nil) ::
{:ok, Hyper.Vm.Instance.arch()} | {:error, term()}
20 changes: 20 additions & 0 deletions lib/hyper/cluster/routing.ex
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,26 @@ defmodule Hyper.Cluster.Routing do
@spec via(term()) :: {:via, module(), {atom(), term()}}
def via(key), do: {:via, Horde.Registry, {@name, key}}

@doc """
Register the calling process under `key` from inside its own `init`.

Prefer this over starting a process with a `{:via, Horde.Registry, _}` name.
OTP's post-start name check (`gen:get_proc_name`) calls `whereis_name`
immediately after the synchronous `register`, but Horde materialises the name
into its local ETS only asynchronously, via the DeltaCRDT diff loop. Under
registry churn that read loses the race and OTP aborts startup with
`{:process_not_registered_via, Horde.Registry}`. Registering from within
`init` carries no such self-check, while leaving the name cluster-resolvable
through `via/1` once the diff propagates (callers already tolerate that lag).
"""
@spec register_self(term()) :: :ok | {:error, {:already_registered, pid()}}
def register_self(key) do
case Horde.Registry.register(@name, key, nil) do
{:ok, _pid} -> :ok
{:error, {:already_registered, _pid}} = err -> err
end
end

@doc "Which node currently runs `vm_id`? `nil` if unknown."
@spec whereis(Hyper.Vm.id()) :: node() | nil
@decorate with_span("Hyper.Cluster.Routing.whereis", include: [:vm_id])
13 changes: 11 additions & 2 deletions lib/hyper/cluster/scheduler.ex
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@ defmodule Hyper.Cluster.Scheduler do
alias Hyper.Vm.Instance.Spec
alias Unit.Information

require Logger

use OpenTelemetryDecorator

@type layer_sizes :: [{Hyper.Layer.id(), Unit.Information.t()}]
@@ -45,8 +47,15 @@ defmodule Hyper.Cluster.Scheduler do
|> candidates(layers)
|> Enum.reduce_while({:error, :no_capacity}, fn node, acc ->
case attempt.(node) do
{:ok, result} -> {:halt, {:ok, {node, result}}}
{:error, _reason} -> {:cont, acc}
{:ok, result} ->
{:halt, {:ok, {node, result}}}

{:error, reason} ->
# The candidate fit the snapshot but refused at confirmation time.
# Log the real reason: otherwise an actual boot failure on the only
# candidate is indistinguishable from genuine `:no_capacity`.
Logger.warning("scheduler: #{inspect(node)} refused placement: #{inspect(reason)}")
{:cont, acc}
end
end)
end