Operating your runner fleet

The quickstart gets one runner online. This page is the rest of its life: organizing a fleet with groups and labels, handing packs their credentials, updating, troubleshooting, and removing a host cleanly. (Installation itself is covered in the quickstart.)

Groups and labels

Every runner declares a group (a single string such as cassandra-prod or web) and free-form labels (role=db, region=us-east-1) in /etc/emisar/config.yaml. Groups are load-bearing: runbook steps target them, per-member runner scopes follow them, and an LLM fan-out call can address every runner in a group at once. Pick group names the way you'd name a fleet tier, not a host. The installer defaults the group to the hostname, which is fine for one box and worth changing the moment you have a second.

Enrollment keys

  • Single-use, shown once. Each emkey-auth-… key enrolls exactly one runner; concurrent attempts with the same key can't both win. The runner trades it for its own long-lived token (rnrtok-…, stored hashed cloud-side, mode-0600 on the host) and the key is spent.
  • Mint per host, at install time. Runners → Connect a runner generates the install one-liner with a fresh key baked in. For fleet provisioning (cloud-init, Packer, Ansible), mint a key per host and inject it as EMISAR_AUTH_KEY.
  • Revocation is cloud-side. Revoke a runner's token (or disable the runner) and its next frame gets a 401; the service exits rather than retrying forever.
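In a provisioning script, a small guard can catch a missing or mistyped key before the installer runs. The sketch below is illustrative only: check_enroll_key is a hypothetical helper, and passing the key through the installer's environment is inferred from the EMISAR_AUTH_KEY name rather than spelled out on this page.

```shell
# Hypothetical pre-flight check for a provisioning script (cloud-init, Packer, Ansible).
# Returns 0 only if the value carries the emkey-auth- prefix described above.
check_enroll_key() {
  case "${1:-}" in
    emkey-auth-?*) return 0 ;;
    "") echo "EMISAR_AUTH_KEY is empty: mint a key for this host first" >&2; return 1 ;;
    *)  echo "EMISAR_AUTH_KEY does not start with emkey-auth-" >&2; return 1 ;;
  esac
}

# Assumed usage (not run here): hand the key to the installer through sudo's environment.
# check_enroll_key "$EMISAR_AUTH_KEY" &&
#   curl -sSL https://emisar.dev/install.sh | sudo EMISAR_AUTH_KEY="$EMISAR_AUTH_KEY" bash
```

The check only looks at the prefix; whether a key is spent or revoked is decided cloud-side at enrollment.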

Giving packs their credentials

Packs that talk to a service — Nomad, Consul, Postgres — read credentials from the runner's environment, never from call arguments, so secrets never transit the cloud. Two steps on the host:

# 1. /etc/emisar/runner.env (mode 0600) — the values
NOMAD_ADDR=http://127.0.0.1:4646
NOMAD_TOKEN=<acl-token>

# 2. /etc/emisar/config.yaml — allowlist the names
execution:
  inherit_env:
    - NOMAD_ADDR
    - NOMAD_TOKEN

Only allowlisted names reach an action's process (on top of the PATH/locale baseline) — the runner's own environment and its token never leak through. Restart the service after editing; what each pack needs is in emisar pack info <id>.
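The two steps are easy to get out of sync: a value set in runner.env but not allowlisted never reaches the pack. A minimal sketch of two host-side helpers, both hypothetical names: ensure_private_env creates the env file with owner-only permissions, and unlisted_env_names does a naive text match (not a YAML parse) for names missing from the config.

```shell
# Create the env file if needed and force owner-only permissions (mode 0600).
# $1 = target path; on an installed host this would be /etc/emisar/runner.env.
ensure_private_env() {
  ( umask 077 && touch "$1" ) && chmod 600 "$1"
}

# Print every variable name assigned in the env file ($1) that does not appear
# as a "- NAME" list item in the config file ($2).
# Caveat: this matches list items anywhere in the file, not only under inherit_env.
unlisted_env_names() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | while read -r name; do
    grep -Eq "^[[:space:]]*-[[:space:]]*${name}[[:space:]]*$" "$2" || echo "$name"
  done
}
```

Run as root (the env file is 0600), unlisted_env_names /etc/emisar/runner.env /etc/emisar/config.yaml prints nothing when every name is allowlisted.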

Online, offline, and stuck runs

The runner heartbeats every 30 seconds; the dashboard's online badge reflects it, and either side closes a connection that goes quiet. Disconnects are normal — the runner reconnects with exponential backoff and re-advertises its catalog, and in-flight actions keep executing through the gap with their results delivered on the new connection. If a runner stays offline past a two-minute grace, its in-flight runs are marked as errored with an explanation — nothing sits "running" forever.
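For a feel of what exponential backoff means for time-to-reconnect, here is a sketch of a doubling delay clamped to a ceiling. The base and cap values are made up for illustration; this page does not state the runner's actual parameters, and real clients often add random jitter on top.

```shell
# Illustrative only: seconds to wait before reconnect attempt N (0-based).
# $1 = attempt number, $2 = base delay (default 1), $3 = cap (default 60).
# The function body runs in a subshell so its variables stay local.
backoff_delay() (
  attempt=$1; base=${2:-1}; cap=${3:-60}
  delay=$base
  i=0
  # Double once per prior attempt, stopping early once the cap is reached.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
  echo "$delay"
)
```

With these example defaults the waits run 1, 2, 4, 8, 16, 32, then stay at 60 seconds.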

Updating

# runner binary — re-run the installer; configs and packs are preserved
$ curl -sSL https://emisar.dev/install.sh | sudo bash

# packs — update from the registry, then reload (no restart, no dropped runs)
$ sudo emisar pack update linux-core
$ sudo systemctl reload emisar

Reload (SIGHUP) re-reads packs and re-advertises the catalog on the live connection. A changed pack hash lands as pending on the Packs page until someone trusts it — updates are visible, never silent.
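When several packs need updating, the two commands above can be wrapped so the reload happens once, and only if every update succeeded. A minimal sketch; update_packs is a hypothetical helper name, and it uses only the commands shown on this page.

```shell
# Update each pack given as an argument, then reload the service once.
# Stops at the first failed update, so a failed update never triggers a reload.
update_packs() {
  for pack in "$@"; do
    sudo emisar pack update "$pack" || return 1
  done
  sudo systemctl reload emisar
}
```

For example, update_packs linux-core is equivalent to the two-line sequence above. Changed hashes still land as pending on the Packs page until someone trusts them.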

The host-side toolbox

$ journalctl -u emisar -f          # service logs
$ sudo emisar action list           # what this host advertises
$ sudo emisar pack list             # loaded packs + hashes
$ sudo emisar events tail -f        # the local JSONL journal, live
$ sudo emisar audit verify --all    # verify the journal's hash chain

(The CLI defaults to --config /etc/emisar/config.yaml, so these work as-is on an installed host.) For defense-in-depth controls on the host (the local admission allowlist, systemd hardening drop-ins, granting specific actions elevated privileges via polkit/sudoers), see the security model.

Troubleshooting

  • Never shows up: journalctl -u emisar -f. A 401 means a spent or revoked enrollment key — mint a fresh one. A connect timeout means outbound 443 to emisar.dev is blocked.
  • Service is "failed": after five crashes in five minutes systemd stops retrying (deliberate: a revoked key shouldn't hammer the cloud). Fix the cause, then sudo systemctl reset-failed emisar && sudo systemctl start emisar.
  • Action won't dispatch: check the Packs page first — a pending (untrusted) or drifted pack hash blocks dispatch by design. Then check the member's or key's runner scope.
  • Pack refuses to load: the journal names the exact file and rule — duplicate action IDs, a script escaping the pack root, and symlinks (without allow_symlinks) all fail closed. emisar pack validate ./pack reproduces it locally.
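The "failed" recovery can be scripted so the reset only happens when the unit really is in that state. A minimal sketch using standard systemctl subcommands; recover_emisar is a hypothetical helper name, and it assumes the root cause has already been fixed.

```shell
# Clear systemd's failed state and start the unit again, but only when the unit
# is actually "failed". Returns non-zero and changes nothing otherwise.
recover_emisar() {
  if systemctl is-failed --quiet emisar; then
    sudo systemctl reset-failed emisar && sudo systemctl start emisar
  else
    echo "emisar is not in the failed state; nothing to reset" >&2
    return 1
  fi
}
```

Follow it with journalctl -u emisar -f to confirm the service stays up this time.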

Removing a runner

# on the host — remove binary + service (keep config/data/logs)
$ sudo bash install.sh --uninstall
# …or purge everything, including the local journal
$ sudo bash install.sh --uninstall --purge

Then delete the runner in the dashboard. Its run history and audit events stay — history survives the host. --purge deletes the local journal too, so export it first if you keep host-side forensics.
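One way to export before purging is to archive the journal directory and confirm the archive is readable. This page does not state where the journal lives, so the sketch takes the directory as an argument; export_before_purge is a hypothetical helper name.

```shell
# Archive a directory before a purge and verify the archive can be read back.
# $1 = directory to archive (the journal location is not stated on this page),
# $2 = output .tar.gz path.
export_before_purge() {
  [ -d "$1" ] || { echo "not a directory: $1" >&2; return 1; }
  # Store paths relative to the parent so the archive unpacks into one folder.
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")" &&
    tar -tzf "$2" > /dev/null
}
```

Copy the resulting archive off the host before running the --purge variant, since the purge removes the local copy.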