Kai coilysiren

$ ssh kai@kai-server

┌────────────────────────────────────┐
│ kai-server · lights-out            │
│ uptime: ten years counting         │
│ operator: kai siren · east bay, ca │
└────────────────────────────────────┘

● platform.target
     Active: active (running)
     Status: "27 devices · 48 pods
              · 2 ✗ · agents on shift"

⚙⚒ agents on the line ⚒⚙

`> whoami`

Hi! I'm Kai. Platform engineer, 10+ years in. Day job: accelerating engineers as their work goes agentic, with observability for LLM consumers as the current bet. Off-hours I run a small lights-out factory: single-node k3s homelab, a herd of agents building and breaking my own services in the dark, a steady output of small tools. Wire it in, instrument it, push on it until it breaks.

Most excited about gauntlet: a two-agent adversarial loop that infers software correctness under sustained, targeted attack. More at /now.

`> lights_out`

The factory framing is not a bit. The goal is a dark factory: code written by agents, verified by attack, shipped while I sleep. The pieces that make that safe instead of terrifying:

A security boundary first. Agents on this fleet route privileged operations through coily, an escape-hatch-resistant CLI wrapper. Every privileged call lands in an audit log. The interesting design constraint is that the boundary must hold against the agent operating inside it, which rules out most of the obvious implementations.
Verification by adversary, not by vibes. gauntlet runs a two-role loop, an attacker and an inspector, against a running service and infers correctness from how the service behaves under sustained attack. Built for the case where a human never reads the diff.
Observability over the whole substrate. repo-recall joins OTel spans, git state, and Claude Code sessions into one queryable surface. session-lattice maintains incremental views over it. Agent-to-agent traffic rides otel-a2a-relay, so even agents talking to each other show up as spans.
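The first piece above, one chokepoint with an audit record per call, can be sketched in a few lines. This is a hypothetical illustration and not coily's implementation: the `run_privileged` name, the allowlist, and the JSON-lines log format are all assumptions. It shows only the routing and logging properties, and makes no attempt at the harder constraint of holding against the agent inside the boundary.

```python
import json
import subprocess
import time
from pathlib import Path

# Hypothetical allowlist: only these executables may run through the wrapper.
ALLOWED = {"echo", "kubectl", "tailscale"}


def run_privileged(argv: list, audit_path: Path) -> int:
    """Run argv if its executable is allowlisted; append one audit record either way."""
    allowed = bool(argv) and argv[0] in ALLOWED
    record = {"ts": time.time(), "argv": argv, "allowed": allowed, "exit": None}
    if allowed:
        # argv is passed as a list with no shell, so metacharacters are not interpreted.
        record["exit"] = subprocess.run(argv, check=False).returncode
    # Denied calls are logged too: the audit trail covers attempts, not just successes.
    with audit_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    # 126 is borrowed from the shell convention for "found but not executable".
    return record["exit"] if allowed else 126
```

A denied call never reaches `subprocess.run`, but it still leaves a line in the log, which is the property that matters when the caller is an agent rather than a person.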

When the line breaks, the agents file the issue. When it breaks badly, see the power strip in the tailnet section below.

`> production_floor`

The floor is organized into three coilyco bays, plus the operator's own. Two starting points if you're browsing: gauntlet is the thesis in code, and coily is the hard design problem. If you want to click something that runs right now, the galaxy sim is live at galaxy-gen.coilysiren.me.

	coilyco-flight-deck - the flight deck, where the builds launch. The flagship is gauntlet `RUNNING HOT`, the two-agent adversarial loop from the thesis above: point it at a running service and it infers correctness from how the service holds up under sustained, targeted attack. Feeding it context is the observability substrate: repo-recall `ACTIVE` indexes every Claude Code session on the fleet, and session-lattice `SCAFFOLDED` keeps incremental materialized views over that data (Feldera, DBSP) for luca to answer questions with. infrastructure `OPERATIONAL` is the factory floor everything else stands on - the single-node k3s cluster, GH Actions deploys, SSM-backed secrets, Tailscale. And for something with no agents in it at all, galaxy-gen `LIVE` draws procedural galaxies in Rust-compiled-to-WASM at galaxy-gen.coilysiren.me.
	coilyco-bridge - the bridge, where the controls live. coily `ACTIVE` is the security boundary the whole lights-out bet rests on: an escape-hatch-resistant CLI wrapper that privileged operations route through, audit-logging every call. The design constraint that makes it interesting is that the boundary has to hold against the agent operating inside it, which rules out most of the obvious implementations. Its neighbor eco-cycle-prep `ACTIVE` runs the automation that stands up each new Eco server cycle.
	coilyco-gaming - the gaming bay, newest on the floor. Everything for the Eco via Sirens game server lives here. eco-app is the companion-services monorepo - the MCP server Claude Desktop talks to, the player-professions dashboard, the replay browser, and the telemetry mod, four former repos fused into one deployable. eco-mods carries the C# gameplay mods that run inside the server itself.
	coilysiren - the operator's own bay. The personal namespace: this profile you're reading, and the site at coilysiren.me, where the resume and the /now page live.
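The attacker-and-inspector loop behind gauntlet can be pictured with a toy version. Everything here is a hypothetical sketch, not gauntlet's code: the `attacker`, `inspector`, and `run_loop` names, the fixed payload list, and the "result must land in 0..100" invariant are assumptions made for illustration. The service under test is just a function, standing in for a running service.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Verdict:
    rounds: int = 0
    violations: List[str] = field(default_factory=list)

    @property
    def survived(self) -> bool:
        return not self.violations


def attacker(rng: random.Random) -> str:
    """Hypothetical attacker role: pick a hostile-looking input."""
    return rng.choice(["", "0", "-1", "9" * 40, "1e309", "NaN", "  7 ", "abc"])


def inspector(payload: str, service: Callable[[str], int]) -> Optional[str]:
    """Hypothetical inspector role: check one invariant on the response."""
    try:
        result = service(payload)
    except ValueError:
        return None  # a clean rejection counts as correct behavior
    except Exception as exc:  # any other exception counts as a crash
        return f"{payload!r}: crashed with {type(exc).__name__}"
    if not 0 <= result <= 100:
        return f"{payload!r}: out-of-range result {result}"
    return None


def run_loop(service: Callable[[str], int], rounds: int, seed: int = 0) -> Verdict:
    """Alternate the two roles and collect every invariant violation."""
    rng = random.Random(seed)
    verdict = Verdict()
    for _ in range(rounds):
        finding = inspector(attacker(rng), service)
        verdict.rounds += 1
        if finding:
            verdict.violations.append(finding)
    return verdict


def clamped_service(raw: str) -> int:
    """Toy service that parses a percentage and clamps it into range."""
    return max(0, min(100, int(raw)))


def naive_service(raw: str) -> int:
    """Toy service that forgets to clamp."""
    return int(raw)
```

Running `run_loop(clamped_service, 200)` and `run_loop(naive_service, 200)` separates the two services without anyone reading either implementation, which is the point of inferring correctness from behavior under attack.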

`> shift_report`

role:     Senior Platform Engineer
employer: Kapwing
shift:    lights-out
fleet:    27 devices · 1 tailnet

specialties:
  - platform / SRE
  - AI agents + MCP
  - observability
  - adversarial testing

prior_art:
  - urfave/cli maintainer
  - HHS gov site @ Nava
  - DevOps EM @ EnergyHub
  - BGP VPN @ Textio
  - Crypto product @ Callisto

`> tailnet`

Everything above is claims. From here down, receipts.

"Homelab" undersells it. The fleet is joined by Tailscale into a single tailnet across two physical sites, and the device list is most of the story: the machines, the phones, the WSL guests, and every k3s service that publishes itself onto the mesh as its own node.

site 1 · east bay
├─ kai-server         k3s · always-on
├─ kai-tower-3026     3090 ti · llm
├─ kai-desktop-tower  rtx 2080 · dark
└─ kasa hs300         hard-reset path

site 2
└─ ser8               warm standby · DR

roaming
├─ kais-macbook-pro
├─ kai-windows-laptop
└─ pixel-9

ephemeral
└─ gha runners · wsl · k8s proxies

`> fleet_inventory`

Node	Notes
kai-server	Intel i7-14700, 32 GB, no dGPU. The always-on box: single-node k3s running every personal service, plus game servers (Eco, Factorio, Icarus, Core Keeper). The only machine allowed to hold state.
kai-tower-3026	Brand-new AM5 build: Ryzen 9 9950X3D, 64 GB DDR5, RTX 3090 Ti 24 GB. Daily driver, and heavy LLM machine one of two.
kai-desktop-tower	The previous tower, i7-8700 with an RTX 2080. Heavy LLM machine two of two, currently dark: the new build is borrowing its power cable. Marked dark in the site map above and `○` offline in the tailscale status below until a second cable arrives.
kai-windows-laptop	i7-11800H, 16 GB, RTX 3060 mobile. Travel Windows host, burst inference when open.
kais-macbook-pro	Apple Silicon. Travel default, where most Claude Code sessions originate. Runs a local Qwen 9B (MLX) through Ollama with OpenCode pointed at it, scoped to trivial tasks.
ser8	Beelink SER8, Ryzen 7 PRO 8845HS, 64 GB. Cross-site warm standby for the k3s control plane. Separate power, ISP, and site, which is what makes the DR story real.

Footnotes: a worker-only Radxa Zero 3W appears in the standby topology but is unfit to hold state (WiFi plus SD card, no thanks), and a Kasa HS300 smart power strip feeds the site-1 fleet as the hard-power-cycle path of last resort. When software observability fails, there is always the physical layer.

`> tailscale_status`

The live mesh, regenerated by scripts/fleet-readout.sh. Hostnames real, everything opaque redacted, third-party devices excluded.

$ tailscale status
  ● kais-macbook-pro             macos
  ● api                          linux
  ● backend-db                   linux
  ○ coilysiren-backend-coilysir… linux
  ● coilysiren-eco-mcp-app-coil… linux
  ● coilysiren-eco-spec-tracker… linux
  ○ coilysiren-galaxy-gen-coily… linux
  ● forgejo-1                    linux
  ○ forgejo                      linux
  ● galaxy-gen                   linux
  ○ kai-desktop-tower-wsl        linux
  ○ kai-desktop-tower            windows
  ○ kai-mac-kapwing              macos
  ○ kai-macbook-pro-vm           linux
  ● kai-server                   linux
  ● kai-tower-3026-wsl           linux
  ● kai-tower-3026               windows
  ○ kai-windows-laptop           windows
  ○ kais-macbook-pro-1           macos
  ● ntfy                         linux
  ○ observability-vmsingle-tail… linux
  ● pixel-9                      android
  ● repo-recall                  linux
  ● ser8                         linux
  ● signoz                       linux
  ● tailscale-operator           linux
  ● vmsingle                     linux
  27 devices · 1 tailnet · 2 sites

Yes, the phone is a tailnet node. Yes, the Forgejo instance, the notification daemon, and the metrics store are each their own device. The Tailscale operator publishes k3s services onto the mesh, so the cluster's insides show up on the device list like roommates.

`> kubectl_get_pods`

The same factory from the cluster's point of view, same redaction rules (hash suffixes are opaque ids, so they drop).

$ kubectl get pods -A
  cert-manager/
    ● cert-manager
    ● cert-manager-cainjector
    ● cert-manager-webhook
  coilysiren-backend/
    ● coilysiren-backend-app
    ● coilysiren-backend-db
  coilysiren-eco-mcp-app/
    ● coilysiren-eco-mcp-app-app
  coilysiren-eco-spec-tracker/
    ● coilysiren-eco-spec-tracker-a…
  coilysiren-galaxy-gen/
    ● coilysiren-galaxy-gen-app
  default/
    ● null-db
  external-secrets/
    ● external-secrets
    ● external-secrets-cert-control…
    ● external-secrets-webhook
  forgejo/
    ● forgejo-db
    ● forgejo
    ◌ forgejo-runner
    ✗ forgejo-runner-tap-writer
    ● ts-forgejo
  kube-system/
    ● coredns
    ✓ helm-install-traefik-crd
    ✓ helm-install-traefik
    ● local-path-provisioner
    ● metrics-server
    ● svclb-traefik ×3
    ● traefik
  lunch-money/
    ● lunch-money-lunch-money-k8s
  ntfy/
    ● ntfy
  observability/
    ● chi-signoz-clickhouse-cluster
    ● grafana
    ✗ node-exporter-prometheus-node…
    ● node-exporter-prometheus-node… ×2
    ● signoz
    ● signoz-clickhouse-operator
    ● signoz-otel-collector
    ✓ signoz-telemetrystore-migrator
    ● signoz-zookeeper
    ● ts-signoz
    ● ts-vmsingle
    ● victoria-metrics-victoria-met…
    ● vmagent-victoria-metrics-agent
  openclaw/
    ◌ openclaw
  registry/
    ● registry
  repo-recall/
    ● repo-recall
  tailscale/
    ● operator
    ● ts-coilysiren-eco-mcp-app-ser…
    ● ts-coilysiren-eco-spec-tracke…
  48 pods · 16 namespaces · 1/3 nodes

The ✗ marks are real. So is the 1/3 nodes: two joined workers (the WSL guest and a Mac VM from the tailnet list above) sit NotReady while kai-server carries everything. A lights-out factory that only ever shows green is lying to you.

`> local_llm_modes`

The fleet maps onto a three-mode local-model plan:

Mode 1 (burst) - the dGPU machines, when they happen to be on and plugged in. The new tower's 3090 Ti is the workhorse, the old tower's 2080 rejoins the line once it gets its power cable back, and the laptop's 3060 pitches in.
Mode 2 (always-on) - kai-server orchestrates: it calls into a tower GPU over the tailnet when one is reachable, and falls back to CPU-only inference or an API otherwise. CPU-only on the i7-14700 is real but humble.
Mode 3 (api) - frontier models over the wire for everything that deserves them.

And one edge case: the Mac keeps a Qwen 9B warm through Ollama + OpenCode, scoped to trivial tasks only. Everything bigger escalates up the modes.
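The escalation order across the three modes can be written down as a small routing function. This is a sketch under stated assumptions, not the fleet's actual orchestrator: the `Task` fields, the `pick_backend` name, and the idea that reachability arrives as a boolean are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    TOWER_GPU = "tower-gpu"  # mode 1: burst, reached over the tailnet
    LOCAL_CPU = "local-cpu"  # mode 2 fallback: CPU-only on kai-server
    API = "api"              # mode 3: frontier model over the wire


@dataclass(frozen=True)
class Task:
    name: str
    needs_frontier: bool = False  # hypothetical flag: the task deserves an API model
    cpu_ok: bool = True           # hypothetical flag: small enough for CPU-only inference


def pick_backend(task: Task, tower_reachable: bool) -> Backend:
    """Choose a backend following the burst, always-on, api order."""
    if task.needs_frontier:
        return Backend.API
    if tower_reachable:
        return Backend.TOWER_GPU
    # No GPU on the mesh right now: stay local if the task is small, else go to an API.
    return Backend.LOCAL_CPU if task.cpu_ok else Backend.API
```

For example, `pick_backend(Task("summarize"), tower_reachable=False)` stays on the CPU, while the same call with `cpu_ok=False` escalates to the API.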

`> stack`

Python, Go, TypeScript, Bash, C#. AWS, Kubernetes (k3s), Terraform, Docker, Tailscale. Prometheus, Grafana, Sentry, OpenTelemetry. Claude Code, MCP.

`> service_history`

2025-now   Kapwing    Senior SWE
2023-2025  Nava       Principal Infra
2022-2023  Textio     Staff Infra
2021-2022  EnergyHub  DevOps EM
2020-2021  Bluelink   Senior Backend
2018-2020  Textio     Senior Infra
2016-2018  Callisto   Senior SWE

Older: Harlot, Quirell/CollectQT, NASA Goddard. Full résumé: coilysiren.me/resume. What I'm doing right now: coilysiren.me/now.

`> faq`

Why does a profile README have a network diagram and a pod listing? Because this repo is the one place in the fleet with no size cap, no managed hooks, and no validators. Every other repo I own answers to a pre-commit suite rolled out from a central baseline. This one carries an exemption marker and does what it wants. Naturally it became the long-form surface.

Are the readouts real? Yes. They're generated by scripts/fleet-readout.sh against the live tailnet and cluster, then pasted in. The redaction is the interesting part: tailnet IPs, FQDNs, account labels, pod hash suffixes, and other people's devices are all stripped before anything lands in git, because opaque identifiers stay out of tracked files on principle. The systemd unit in the banner is aspirational - the numbers in its status line are not.
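A redaction pass of the kind described above might look like the following. The real rules live in scripts/fleet-readout.sh and are not shown here, so every pattern below is an assumption: it relies on Tailscale handing out IPv4 addresses from 100.64.0.0/10, on MagicDNS names ending in `.ts.net`, and on a heuristic for the `-<hash>-<5 chars>` suffix that Deployment-managed pods carry. Account labels and third-party devices are out of scope for this sketch.

```python
import re

# Assumed: tailnet IPv4 addresses fall in the CGNAT range 100.64.0.0/10,
# so the second octet is 64 through 127.
TAILNET_IP = re.compile(
    r"\b100\.(?:6[4-9]|[7-9]\d|1[01]\d|12[0-7])\.\d{1,3}\.\d{1,3}\b"
)
# Assumed: MagicDNS FQDNs end in .ts.net; the first label is the hostname to keep.
TAILNET_FQDN = re.compile(r"\b([a-z0-9-]+)(?:\.[a-z0-9-]+)*\.ts\.net\b")
# Heuristic: a ReplicaSet hash plus a 5-character pod id at the end of a pod name.
POD_HASH = re.compile(r"-[a-z0-9]{8,10}-[a-z0-9]{5}\b")


def redact(line: str) -> str:
    """Strip opaque identifiers from one readout line, keeping hostnames."""
    line = TAILNET_IP.sub("[ip]", line)
    line = TAILNET_FQDN.sub(r"\1", line)
    return POD_HASH.sub("", line)
```

Given a made-up line such as `100.101.102.103  kai-server.tail1234.ts.net  coilysiren-backend-app-7d9f8b6c5d-x2k9q`, this keeps the hostname and the workload name and drops the rest. Addresses outside the tailnet range, such as `10.0.0.5`, pass through untouched.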

`> comms`

coilysiren.me · Bluesky · X · LinkedIn
