chore: release v1.7.45-alpha

Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.

Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
  replaces fragile post-start exec that failed under restricted-cap rootless
  podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
  emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
  packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
  missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
  S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
  shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
  restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
  lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition

Infra:
- New resilience harness (scripts/resilience/): black-box state-machine
  tester, every app × every transition. Run before each release.

Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -5,13 +5,13 @@
 # Usage: source /opt/archipelago/image-versions.sh 2>/dev/null || true
 # source "$(dirname "$0")/image-versions.sh" 2>/dev/null || true
 #
-# Tags MUST match what's actually in the registry at git.tx1138.com/lfg2025/
-# Run: podman images --format '{{.Repository}}:{{.Tag}}' | grep 'git.tx1138' | sort
+# Tags MUST match what's actually in the registry at 146.59.87.168:3000/lfg2025/
+# Run: podman images --format '{{.Repository}}:{{.Tag}}' | grep '146.59.87.168:3000' | sort
 # to verify against the registry.
 
 # Archipelago app registries (primary + fallback)
-ARCHY_REGISTRY="git.tx1138.com/lfg2025"
-ARCHY_REGISTRY_FALLBACK="146.59.87.168:3000/lfg2025"
+ARCHY_REGISTRY="146.59.87.168:3000/lfg2025"
+ARCHY_REGISTRY_FALLBACK="git.tx1138.com/lfg2025"
 
 # Bitcoin stack
 BITCOIN_KNOTS_IMAGE="$ARCHY_REGISTRY/bitcoin-knots:latest"
109
scripts/resilience/README.md
Normal file
@@ -0,0 +1,109 @@
# Resilience Harness

Black-box state-machine tester for archipelago app containers.

Drives the live RPC against a real archipelago + podman runtime on a target
host. For each app in `app-catalog/catalog.json`, runs every state transition
a user could trigger and asserts the system stays in the expected state.

## Why this exists

We shipped v1.7.43-alpha on .228 with three independent bugs that no unit test
caught:

1. `indeedhub-api` crashlooped 8500+ times because `stacks.rs` was missing 5
   env vars (`QUEUE_HOST`/`QUEUE_PORT`/`DATABASE_PORT`/`S3_PRIVATE_BUCKET_NAME`/
   `AES_MASTER_SECRET`). The install "succeeded" (containers running) but the
   API never became healthy.
2. `bitcoin-ui` shipped with a stale baked-in `Authorization: Basic …` header
   from the registry image, so every `/bitcoin-rpc/` call returned 401.
3. The container-absence scanner evicted apps from the UI 14 seconds into
   install (before image pull finished).

All three were exactly the kind of bug a "did the user-visible flow actually
work end to end?" test would catch, and the kind a single-file unit test
will never catch. This harness is the gate.

## Running

Against the .228 test node:

    scripts/resilience/resilience.sh archipelago@192.168.1.228

Or non-interactive (CI):

    RESILIENCE_SSH_PASS=… RESILIENCE_UI_PASS=… \
      scripts/resilience/resilience.sh archipelago@192.168.1.228

Filters:

    # Smoke test (3 apps, no reboot, ~15min)
    scripts/resilience/resilience.sh archipelago@192.168.1.228 smoke

    # Single app
    scripts/resilience/resilience.sh archipelago@192.168.1.228 bitcoin-knots

    # Subset
    scripts/resilience/resilience.sh archipelago@192.168.1.228 bitcoin-knots,lnd

Without a filter, the harness sweeps **every** app in the catalog
(~24 apps × 7 per-app transitions + 2 batch transitions) and runs the
batch transitions (archipelago.service restart, host reboot) at the end.
Full sweep is ~3-4 hours and **reboots the target host** as part of the
run, so only point it at a dedicated test node.
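
How a filter argument narrows the app list can be sketched as follows. This is a minimal illustration rather than the harness's exact code: `select_apps` is a hypothetical name, catalog IDs are assumed to arrive on stdin one per line, and the sample catalog is made up.

```shell
#!/bin/sh
# Sketch only: select_apps is a hypothetical helper name.
# Catalog IDs arrive on stdin, one per line; $1 is the filter argument.
select_apps() {
    case "$1" in
        "")
            # Full sweep: every app except bitcoin-core, which shares
            # container slots with bitcoin-knots.
            grep -v '^bitcoin-core$' ;;
        smoke)
            # Curated fast subset (stdin is ignored).
            printf '%s\n' filebrowser bitcoin-knots indeedhub ;;
        *)
            # "a,b" becomes the anchored regex ^(a|b)$ so IDs match exactly.
            grep -E "^($(printf '%s' "$1" | tr ',' '|'))$" ;;
    esac
}

# Made-up catalog order for the demo.
catalog="bitcoin-knots
bitcoin-core
lnd
filebrowser"

printf '%s\n' "$catalog" | select_apps "bitcoin-knots,lnd"
```

Anchoring the pattern with `^` and `$` matters: without it, a filter of `lnd` would also select any app whose ID merely contains `lnd`.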

## What it tests

Per-app transitions:

| # | Transition | Pass criteria                                                       |
|---|------------|---------------------------------------------------------------------|
| 1 | install    | All containers reach `running` within 15 min                        |
| 2 | ui_probe   | HTTP 2xx/3xx via `https://<host>/app/<id>/`                         |
| 3 | auth_probe | (credentialed apps only) e.g. bitcoin-rpc returns 200, not 401      |
| 4 | stop       | All containers reach `exited` state                                 |
| 5 | start      | All containers reach `running` state                                |
| 6 | restart    | All containers `running` after restart                              |
| 7 | uninstall  | All containers absent, no residue                                   |

Batch transitions (full sweep only):

| # | Transition                  | Pass criteria                  |
|---|-----------------------------|--------------------------------|
| 8 | archipelago.service restart | Container set unchanged across |
| 9 | host reboot                 | Container set unchanged across |

Coverage by design: discovery rather than encoded metadata. The harness
snapshots `podman ps -a` before install, again after install stabilizes,
and the difference IS this app's container set. Works equally well for
single-container apps and 7-container stacks (indeedhub) without per-app
configuration.
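
The snapshot-diff idea can be sketched in a few lines of shell. This is an illustration on canned data, not the harness's exact code: `new_containers` is a hypothetical helper name, and the two lists stand in for sorted `podman ps -a --format '{{.Names}}'` output taken before and after an install.

```shell
#!/bin/sh
# Sketch only: new_containers is a hypothetical helper name, and the lists
# below are canned stand-ins for sorted container-name snapshots.
new_containers() {
    # $1 = names before install, $2 = names after (one per line, sorted)
    tmp=$(mktemp -d) || return 1
    printf '%s\n' "$1" > "$tmp/before"
    printf '%s\n' "$2" > "$tmp/after"
    # comm -13 prints only the lines unique to the second file.
    LC_ALL=C comm -13 "$tmp/before" "$tmp/after"
    rm -rf "$tmp"
}

before="filebrowser
lnd"
after="archy-bitcoin-ui
bitcoin-knots
filebrowser
lnd"

new_containers "$before" "$after"
```

Both inputs must be sorted in the same collation order for `comm` to work, which is why the sketch pins `LC_ALL=C`.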

## Output

JSON-lines results at `scripts/resilience/reports/<run_ts>/results.jsonl`:

    {"ts":"…","app":"bitcoin-knots","transition":"install","status":"PASS","detail":"bitcoin-knots,archy-bitcoin-ui"}
    {"ts":"…","app":"bitcoin-knots","transition":"auth_probe","status":"PASS","detail":"bitcoin-rpc HTTP 200"}

Exit code: `0` if every cell is green, `1` if any is red, `2` if setup failed
before tests began. Use it as a release gate: refuse to tag if any cell is red.
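
A report can also be checked after the fact. The sketch below is hypothetical and not part of the harness: `gate_on_results` is an invented name, it assumes compact one-object-per-line JSON like the sample rows, and its return codes simply mirror the exit-code convention described above.

```shell
#!/bin/sh
# Sketch only: gate_on_results is a hypothetical helper, not harness code.
# Assumes compact JSON lines, i.e. "status":"FAIL" with no added spaces.
gate_on_results() {
    results="$1"
    [ -r "$results" ] || { echo "no results file: $results" >&2; return 2; }
    # grep -c prints the number of matching lines (0 when none match).
    fails=$(grep -c '"status":"FAIL"' "$results")
    if [ "$fails" -gt 0 ]; then
        echo "refusing to tag: $fails failing cell(s)" >&2
        return 1
    fi
    echo "all cells green"
}

# Demo on a throwaway report with made-up rows.
report=$(mktemp)
printf '%s\n' \
    '{"ts":"t1","app":"demo","transition":"install","status":"PASS","detail":""}' \
    '{"ts":"t2","app":"demo","transition":"stop","status":"FAIL","detail":"demo=running"}' \
    > "$report"
gate_on_results "$report"
echo "gate exit code: $?"
rm -f "$report"
```

Matching on the literal `"status":"FAIL"` is safe against a `detail` string that happens to contain the same text, because quotes inside a JSON string value are escaped with backslashes.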

## Auth flow

The harness uses the same `auth.login` RPC that the UI uses, then carries
`session=…` and `csrf_token=…` cookies plus the `X-CSRF-Token` header on
every subsequent call. It logs in again after the archipelago.service restart
and the host reboot.
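
Reading the CSRF token back out of a curl cookie jar can be sketched like this. `csrf_from_jar` is a hypothetical helper name and the jar contents are made up; the column layout (cookie name in field 6, value in field 7, tab-separated) is curl's Netscape cookie-file format.

```shell
#!/bin/sh
# Sketch only: csrf_from_jar is a hypothetical helper name.
# curl cookie jars are tab-separated: domain, subdomain flag, path, secure,
# expiry, name, value. Lines starting with '#' are comments; curl also
# writes HttpOnly cookies with a '#HttpOnly_' prefix, which this skips.
csrf_from_jar() {
    awk -F '\t' '!/^#/ && $6 == "csrf_token" { print $7; exit }' "$1"
}

# Build a throwaway jar with made-up values to exercise the helper.
jar=$(mktemp)
{
    printf '# Netscape HTTP Cookie File\n'
    printf '#HttpOnly_example.test\tFALSE\t/\tTRUE\t0\tsession\tabc\n'
    printf 'example.test\tFALSE\t/\tTRUE\t0\tcsrf_token\ttok123\n'
} > "$jar"

csrf=$(csrf_from_jar "$jar")
echo "X-CSRF-Token: $csrf"
rm -f "$jar"
```

Matching on the name column, rather than on any line containing `csrf_token`, avoids picking up a different cookie whose value happens to include that string.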

## Caveats / known gaps

- App proxy probe (`/app/<id>/`) only validates that the proxy responds. For
  apps with deeper protocol behavior (lnd, fedimint, mempool) this only
  catches "container alive, proxy reachable", not "the protocol is healthy".
- Multi-container stack assertions: the harness checks **every** new
  container is `running`, so it would catch the indeedhub-api restart loop
  while postgres/redis/minio looked fine.
- Host reboot test is destructive and slow, so it runs once at the end of a
  full sweep.
- `package.start`/`stop`/`restart` RPC methods may not exist for all apps;
  failures are recorded and the harness continues.
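
The "every new container must be `running`" assertion from the second caveat can be sketched as below. `not_running` is a hypothetical helper name and the states are canned for illustration; a real run would read them from podman.

```shell
#!/bin/sh
# Sketch only: not_running is a hypothetical helper. It reads "name state"
# pairs on stdin and prints every container whose state is not "running".
not_running() {
    while read -r name state; do
        [ -n "$name" ] || continue
        [ "$state" = "running" ] || printf '%s=%s\n' "$name" "$state"
    done
}

# Canned states for illustration only.
states="indeedhub-postgres running
indeedhub-redis running
indeedhub-api exited"

bad=$(printf '%s\n' "$states" | not_running)
if [ -n "$bad" ]; then
    echo "FAIL containers not running: $bad"
else
    echo "PASS"
fi
```

Checking each container individually is what lets one unhealthy member fail the whole stack even when its siblings look fine.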
297
scripts/resilience/lib.sh
Executable file
@@ -0,0 +1,297 @@
|
||||
#!/bin/bash
|
||||
# Resilience harness shared helpers.
|
||||
# Sourced by resilience.sh — do not invoke directly.
|
||||
|
||||
# Required env (set by resilience.sh before sourcing):
#   TARGET      ssh target, e.g. archipelago@192.168.1.228
#   RPC_URL     https://<host>/rpc/v1 (nginx on 443 proxies to localhost:5678)
#   COOKIE_JAR  path for curl cookie store
#   SSH_PASS    sshpass password
#   UI_PASS     archipelago UI password
#   OUT_DIR     report output dir
|
||||
|
||||
# ── ssh ─────────────────────────────────────────────────────────
|
||||
ssh_run() {
    # -n: redirect stdin from /dev/null so ssh doesn't consume our parent's
    # stdin. Without this, ssh inside a `while read … done <<< "$LIST"` loop
    # swallows the rest of the here-string on the first call, ending the loop
    # after one iteration. That cost us a smoke run that only tested
    # filebrowser instead of all three smoke apps.
    sshpass -p "$SSH_PASS" ssh -n -o StrictHostKeyChecking=accept-new \
        -o ConnectTimeout=10 -o LogLevel=ERROR "$TARGET" "$@"
}
|
||||
|
||||
# Run a command and tolerate ssh failure (host rebooting, etc.).
|
||||
ssh_try() {
|
||||
sshpass -p "$SSH_PASS" ssh -n -o StrictHostKeyChecking=accept-new \
|
||||
-o ConnectTimeout=5 -o LogLevel=ERROR "$TARGET" "$@" 2>/dev/null || echo "__SSH_FAIL__"
|
||||
}
|
||||
|
||||
ssh_wait_ready() {
|
||||
local deadline=$(($(date +%s) + ${1:-180}))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
if [ "$(ssh_try 'echo OK')" = "OK" ]; then return 0; fi
|
||||
sleep 3
|
||||
done
|
||||
return 1
|
||||
}
|
||||
|
||||
# ── rpc ─────────────────────────────────────────────────────────
|
||||
rpc_login() {
    local payload resp
    # Build the payload with jq so quotes or backslashes in the password are
    # JSON-escaped instead of corrupting the request body.
    payload=$(jq -nc --arg pw "$UI_PASS" \
        '{jsonrpc:"2.0", method:"auth.login", params:{password:$pw}, id:1}')
    resp=$(curl -ksS -c "$COOKIE_JAR" -H "Content-Type: application/json" \
        -d "$payload" "$RPC_URL")
    if echo "$resp" | jq -e '.error' >/dev/null 2>&1; then
        echo "ERROR: login failed: $(echo "$resp" | jq -c .)" >&2
        return 1
    fi
    CSRF_TOKEN=$(awk '/csrf_token/ {print $7}' "$COOKIE_JAR" | head -1)
    [ -n "$CSRF_TOKEN" ] || { echo "ERROR: no CSRF token after login" >&2; return 1; }
    export CSRF_TOKEN
}
|
||||
|
||||
# Make an RPC call. Args: method, json_params, timeout_secs (optional, default 90).
|
||||
# Prints raw JSON response. Caller asserts success via jq.
|
||||
#
|
||||
# CSRF rotates per-response: the server may issue a new csrf_token on every
|
||||
# state-changing call, so we re-read it from the cookie jar before each call
|
||||
# rather than caching the value from login. Also retries up to three times on
# nginx-served BACKEND_UNAVAILABLE (5xx fallback) or 429 for transient stalls.
|
||||
rpc_call() {
|
||||
local method="$1"
|
||||
# NOTE: don't use ${2:-{}} — bash matches the first unescaped `}` as the
|
||||
# end of the expansion, so the trailing `}` becomes a literal char and
|
||||
# corrupts every params value into invalid JSON. Use an if-check instead.
|
||||
local params="${2-}"
|
||||
[ -z "$params" ] && params='{}'
|
||||
local timeout="${3:-90}"
|
||||
local attempt
|
||||
for attempt in 1 2 3 4; do
|
||||
local csrf
|
||||
csrf=$(awk '/^[^#]/ && /csrf_token/ {print $7; exit}' "$COOKIE_JAR")
|
||||
local resp
|
||||
resp=$(curl -ksS -b "$COOKIE_JAR" -c "$COOKIE_JAR" \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "X-CSRF-Token: $csrf" \
|
||||
-d "{\"jsonrpc\":\"2.0\",\"method\":\"$method\",\"params\":$params,\"id\":1}" \
|
||||
--max-time "$timeout" \
|
||||
"$RPC_URL")
|
||||
# Retry on transient errors:
|
||||
# BACKEND_UNAVAILABLE — nginx 5xx fallback (archipelago briefly stalled)
|
||||
# 429 — nginx rate limiter exceeded (burst=40 in /etc/nginx/sites-enabled/*)
|
||||
if echo "$resp" | jq -e '.error.code == "BACKEND_UNAVAILABLE" or .error.code == 429' >/dev/null 2>&1; then
|
||||
[ "$attempt" -eq 4 ] && { echo "$resp"; return; }
|
||||
# Linear backoff: 10s, 20s, 30s (attempt × 10). Plenty of time for the
# nginx rate window (1s) and any archipelago restart to clear.
|
||||
sleep $((attempt * 10))
|
||||
continue
|
||||
fi
|
||||
echo "$resp"
|
||||
return
|
||||
done
|
||||
}
|
||||
|
||||
# After a service restart the session may need re-establishing.
|
||||
rpc_relogin_if_needed() {
|
||||
local probe
|
||||
probe=$(rpc_call "package.list" '{}' 2>/dev/null)
|
||||
if echo "$probe" | jq -e '.error.code == -32001' >/dev/null 2>&1; then
|
||||
rpc_login || return 1
|
||||
fi
|
||||
}
|
||||
|
||||
# ── per-app metadata ────────────────────────────────────────────
|
||||
# Mappings the harness needs that aren't expressible from catalog.json alone:
|
||||
# multi-container stack rosters, alias/variant container names (bitcoin-knots
|
||||
# vs bitcoin-core install the same slots), and the actual nginx UI proxy path
|
||||
# (which often differs from /app/<id>/, e.g. `bitcoin-knots` → `/app/bitcoin-ui/`).
|
||||
#
|
||||
# Keep these tables in sync with the install code in package/stacks.rs and
|
||||
# the `*_IMAGE` companion handling in install.rs (the `archy-<x>-ui` set).
|
||||
|
||||
# Containers an app installs. Used for app_already_installed detection AND
|
||||
# for state assertions when the snapshot-diff falls back (variant apps don't
|
||||
# create new containers when their alternate is already present).
|
||||
expected_containers_for() {
|
||||
case "$1" in
|
||||
bitcoin-knots) echo "bitcoin-knots archy-bitcoin-ui" ;;
|
||||
bitcoin-core) echo "bitcoin-core archy-bitcoin-ui" ;;
|
||||
lnd) echo "lnd archy-lnd-ui" ;;
|
||||
electrumx|electrs|mempool-electrs)
|
||||
echo "electrs archy-electrs-ui" ;;
|
||||
btcpay-server) echo "archy-btcpay-server archy-btcpay-db archy-nbxplorer archy-btcpay-ui" ;;
|
||||
mempool) echo "mempool archy-mempool-web archy-mempool-db" ;;
|
||||
immich) echo "immich_server immich_machine_learning immich_postgres immich_redis" ;;
|
||||
penpot|penpot-frontend)
|
||||
echo "penpot-frontend penpot-backend penpot-exporter penpot-postgres penpot-redis" ;;
|
||||
indeedhub) echo "indeedhub indeedhub-api indeedhub-ffmpeg indeedhub-postgres indeedhub-redis indeedhub-minio indeedhub-relay" ;;
|
||||
*) echo "$1" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
# UI proxy URL path on the HTTPS frontend. Most apps live at /app/<id>/ but
|
||||
# Bitcoin/LND/Electrs proxy through their UI companion containers, and BTCPay
|
||||
# uses its own short path.
|
||||
ui_proxy_path_for() {
|
||||
case "$1" in
|
||||
bitcoin-knots|bitcoin-core) echo "/app/bitcoin-ui/" ;;
|
||||
electrumx|electrs) echo "/app/electrs-ui/" ;;
|
||||
lnd) echo "/app/lnd-ui/" ;;
|
||||
btcpay-server) echo "/app/btcpay/" ;;
|
||||
*) echo "/app/$1/" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
# Authenticated probe for credentialed UIs. Echoes the HTTP status code if a
# probe is defined for the app, otherwise returns 1 (caller records SKIP).
# Which codes count as PASS is per-app (see auth_probe_pass_codes): for BTCPay,
# 401/403 from the app's own auth still proves the proxy reaches the backend,
# unlike a 502 from a broken proxy.
|
||||
auth_probe_for() {
|
||||
local app="$1"
|
||||
local host; host="$(echo "$TARGET" | cut -d@ -f2)"
|
||||
case "$app" in
|
||||
bitcoin-knots|bitcoin-core)
|
||||
# Direct bitcoin-rpc proxy on :8334 inside .228 — credential
|
||||
# plumbing is the .228 bug we just shipped, must return 200.
|
||||
ssh_run 'curl -s -o /dev/null -w "%{http_code}" --max-time 5 -X POST http://127.0.0.1:8334/bitcoin-rpc/ -H "Content-Type: application/json" -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"getblockchaininfo\",\"params\":[]}"'
|
||||
;;
|
||||
btcpay-server)
|
||||
# BTCPay's own auth returns 401 for unauthenticated API calls;
|
||||
# 502 means proxy broken / backend down.
|
||||
curl -ks -o /dev/null -w "%{http_code}" --max-time 5 \
|
||||
"https://$host/app/btcpay/api/v1/server/info"
|
||||
;;
|
||||
lnd)
|
||||
# LND has a /lnd-connect-info passthrough on archipelago itself —
|
||||
# returns lndconnect URI when LND is up. 200 = backend reachable.
|
||||
curl -ks -o /dev/null -w "%{http_code}" --max-time 5 \
|
||||
"https://$host/lnd-connect-info"
|
||||
;;
|
||||
electrumx|electrs)
|
||||
# ElectrumX is plain TCP (electrum protocol) — no HTTPS auth path.
|
||||
# archipelago exposes /electrs-status which queries the daemon.
|
||||
curl -ks -o /dev/null -w "%{http_code}" --max-time 5 \
|
||||
"https://$host/electrs-status"
|
||||
;;
|
||||
*)
|
||||
return 1
|
||||
;;
|
||||
esac
|
||||
}
|
||||
|
||||
# Whether an auth_probe HTTP code counts as a pass.
|
||||
auth_probe_pass_codes() {
|
||||
case "$1" in
|
||||
bitcoin-knots|bitcoin-core) echo "200" ;;
|
||||
btcpay-server) echo "200 401 403" ;;
|
||||
lnd|electrumx|electrs) echo "200" ;;
|
||||
*) echo "200" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
# ── probes (state assertions) ───────────────────────────────────
|
||||
# Returns container Status string ("running","exited","absent",…).
|
||||
probe_container_state() {
|
||||
local name="$1"
|
||||
ssh_run "podman inspect '$name' --format '{{.State.Status}}' 2>/dev/null || echo absent"
|
||||
}
|
||||
|
||||
# Returns RestartCount as integer.
|
||||
probe_container_restart_count() {
|
||||
local name="$1"
|
||||
ssh_run "podman inspect '$name' --format '{{.RestartCount}}' 2>/dev/null || echo -1"
|
||||
}
|
||||
|
||||
# Probe the app's UI proxy on the HTTPS frontend. Returns HTTP code.
|
||||
# Uses ui_proxy_path_for so apps with non-default proxy paths (bitcoin-ui,
|
||||
# lnd-ui, electrs-ui, btcpay) get probed at the right URL.
|
||||
probe_app_proxy() {
|
||||
local app_id="$1"
|
||||
local host
|
||||
host="$(echo "$TARGET" | cut -d@ -f2)"
|
||||
local path
|
||||
path=$(ui_proxy_path_for "$app_id")
|
||||
curl -ks -o /dev/null -w "%{http_code}" --max-time 5 "https://$host$path" || echo "000"
|
||||
}
|
||||
|
||||
# Check that ZERO containers are leftover for this app — catches uninstall residue.
|
||||
probe_no_residue() {
|
||||
local prefix="$1"
|
||||
ssh_run "podman ps -a --format '{{.Names}}' | grep -E '^${prefix}(-|$)' | wc -l"
|
||||
}
|
||||
|
||||
# ── waiters ─────────────────────────────────────────────────────
|
||||
# Wait for the package's state in the RPC list to match expected, with timeout.
|
||||
wait_for_package_state() {
|
||||
local pkg="$1"; local want="$2"; local timeout="${3:-300}"
|
||||
local deadline=$(($(date +%s) + timeout))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
local got
|
||||
got=$(rpc_call "package.list" '{}' \
|
||||
| jq -r ".result.package_data[\"$pkg\"].state // \"absent\"")
|
||||
case "$want" in
|
||||
Running) [ "$got" = "Running" ] && return 0 ;;
|
||||
Stopped) [ "$got" = "Stopped" ] && return 0 ;;
|
||||
absent) [ "$got" = "absent" ] && return 0 ;;
|
||||
esac
|
||||
sleep 4
|
||||
done
|
||||
echo "TIMEOUT waiting for $pkg → $want (last seen: $got)" >&2
|
||||
return 1
|
||||
}
|
||||
|
||||
# Wait for podman state of a specific container.
|
||||
wait_for_container_state() {
|
||||
local name="$1"; local want="$2"; local timeout="${3:-180}"
|
||||
local deadline=$(($(date +%s) + timeout))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
local got
|
||||
got=$(probe_container_state "$name")
|
||||
[ "$got" = "$want" ] && return 0
|
||||
sleep 3
|
||||
done
|
||||
echo "TIMEOUT waiting for container $name → $want (last seen: $got)" >&2
|
||||
return 1
|
||||
}
|
||||
|
||||
# Wait until restart count is stable for `stable_secs` seconds — proxy for "no crashloop".
|
||||
wait_restart_count_stable() {
|
||||
local name="$1"; local stable_secs="${2:-30}"; local timeout="${3:-180}"
|
||||
local deadline=$(($(date +%s) + timeout))
|
||||
local last; local last_change_ts
|
||||
last=$(probe_container_restart_count "$name")
|
||||
last_change_ts=$(date +%s)
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
sleep 5
|
||||
local now
|
||||
now=$(probe_container_restart_count "$name")
|
||||
if [ "$now" != "$last" ]; then
|
||||
last="$now"
|
||||
last_change_ts=$(date +%s)
|
||||
elif [ $(( $(date +%s) - last_change_ts )) -ge "$stable_secs" ]; then
|
||||
return 0
|
||||
fi
|
||||
done
|
||||
echo "TIMEOUT waiting for $name restart-count stable (last=$last)" >&2
|
||||
return 1
|
||||
}
|
||||
|
||||
# ── result recording ────────────────────────────────────────────
|
||||
# Append a result row to the JSON-lines report.
|
||||
# Args: app_id, transition, status (PASS/FAIL/SKIP), detail
|
||||
record() {
|
||||
local app="$1"; local transition="$2"; local status="$3"; local detail="${4:-}"
|
||||
local ts
|
||||
ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
|
||||
jq -nc --arg ts "$ts" --arg app "$app" --arg t "$transition" --arg s "$status" --arg d "$detail" \
|
||||
'{ts:$ts, app:$app, transition:$t, status:$s, detail:$d}' >> "$OUT_DIR/results.jsonl"
|
||||
local marker
|
||||
case "$status" in
|
||||
PASS) marker="✅" ;;
|
||||
FAIL) marker="❌" ;;
|
||||
SKIP) marker="⏭" ;;
|
||||
*) marker="•" ;;
|
||||
esac
|
||||
printf '%s [%-15s] %-30s %s%s\n' "$marker" "$app" "$transition" "$status" "${detail:+ — $detail}"
|
||||
}
|
||||
473
scripts/resilience/resilience.sh
Executable file
@@ -0,0 +1,473 @@
|
||||
#!/bin/bash
|
||||
# Archipelago resilience harness — black-box state-machine tester for app containers.
|
||||
#
|
||||
# Drives the live archipelago RPC against a real podman runtime on a target
|
||||
# host. For each app in the catalog, runs every state transition a user could
|
||||
# trigger (install / probe / stop / start / restart / archipelago-restart /
|
||||
# host-reboot / uninstall / reinstall / vanish-watch) and asserts the system
|
||||
# remains in the expected state at every step.
|
||||
#
|
||||
# Usage:
|
||||
# scripts/resilience/resilience.sh archipelago@192.168.1.228 [filter]
|
||||
#
|
||||
# `filter` is a comma-separated list of app IDs (or "smoke" for the curated
|
||||
# fast subset). Default: every app in app-catalog/catalog.json.
|
||||
#
|
||||
# Exit codes:
|
||||
# 0 every cell green
|
||||
# 1 any cell red — release should not ship
|
||||
# 2 setup/auth error before tests began
|
||||
|
||||
set -uo pipefail
|
||||
|
||||
# ── args ─────────────────────────────────────────────────────────
|
||||
TARGET="${1:?usage: $0 <user@host> [filter]}"
|
||||
FILTER="${2:-}"
|
||||
|
||||
ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
|
||||
HERE="$ROOT/scripts/resilience"
|
||||
RUN_TS="$(date -u +%Y%m%dT%H%M%SZ)"
|
||||
OUT_DIR="$HERE/reports/$RUN_TS"
|
||||
mkdir -p "$OUT_DIR"
|
||||
COOKIE_JAR="$OUT_DIR/cookies.txt"
|
||||
|
||||
HOST="$(echo "$TARGET" | cut -d@ -f2)"
|
||||
# RPC reaches archipelago through nginx on 443 (which proxies to localhost:5678).
|
||||
# Direct :5678 is bound to 127.0.0.1 on the target so we can't curl it from here.
|
||||
RPC_URL="https://$HOST/rpc/v1"
|
||||
|
||||
export TARGET RPC_URL COOKIE_JAR OUT_DIR
|
||||
|
||||
# shellcheck source=lib.sh
|
||||
. "$HERE/lib.sh"
|
||||
|
||||
# ── credentials ──────────────────────────────────────────────────
|
||||
# Pull from env first (so this script can be called from CI). Fall back to
|
||||
# interactive prompts.
|
||||
SSH_PASS="${RESILIENCE_SSH_PASS:-}"
|
||||
UI_PASS="${RESILIENCE_UI_PASS:-}"
|
||||
if [ -z "$SSH_PASS" ]; then
|
||||
read -rsp "SSH password for $TARGET: " SSH_PASS; echo
|
||||
fi
|
||||
if [ -z "$UI_PASS" ]; then
|
||||
read -rsp "Archipelago UI password: " UI_PASS; echo
|
||||
fi
|
||||
export SSH_PASS UI_PASS
|
||||
|
||||
command -v sshpass >/dev/null || { echo "sshpass required"; exit 2; }
|
||||
command -v jq >/dev/null || { echo "jq required"; exit 2; }
|
||||
|
||||
ssh_run 'echo ok' >/dev/null || { echo "ssh to $TARGET failed"; exit 2; }
|
||||
rpc_login || exit 2
|
||||
|
||||
echo "Resilience harness — target $TARGET, run $RUN_TS"
|
||||
echo "Output: $OUT_DIR/results.jsonl"
|
||||
echo "─────────────────────────────────────────────────────────────"
|
||||
|
||||
# ── catalog & filter ─────────────────────────────────────────────
|
||||
CATALOG="$ROOT/app-catalog/catalog.json"
|
||||
ALL_APPS=$(jq -r '.apps[].id' "$CATALOG")
|
||||
|
||||
# Topo-sort the catalog by `requires`. Outputs app IDs in install order
|
||||
# (deps first, then dependents). Kahn's algorithm via python — keeps the
|
||||
# bash side simple and the deps logic obvious for next-time-readers.
|
||||
topo_order() {
    python3 -c "
import json
with open('$CATALOG') as f: c = json.load(f)
deps = {a['id']: list(a.get('requires', [])) for a in c['apps']}
order = []
remaining = set(deps)
while remaining:
    ready = sorted(a for a in remaining if all(d not in remaining for d in deps[a]))
    if not ready:  # cycle (shouldn't happen): emit whatever's left
        order.extend(sorted(remaining)); break
    order.extend(ready); remaining.difference_update(ready)
print('\n'.join(order))
"
}
|
||||
|
||||
apps_to_test() {
|
||||
local order; order=$(topo_order)
|
||||
if [ -z "$FILTER" ]; then
|
||||
# Full sweep — but skip bitcoin-core since it shares container slots
|
||||
# with bitcoin-knots; testing both back-to-back would just churn the
|
||||
# same containers. bitcoin-knots is the canonical entry.
|
||||
echo "$order" | grep -v '^bitcoin-core$'
|
||||
elif [ "$FILTER" = "smoke" ]; then
|
||||
# Fast subset exercising the bug classes we just fixed:
|
||||
# single-container, multi-container stack, credentialed UI.
|
||||
echo -e "filebrowser\nbitcoin-knots\nindeedhub"
|
||||
else
|
||||
echo "$order" | grep -E "^($(echo "$FILTER" | tr ',' '|'))$"
|
||||
fi
|
||||
}
|
||||
|
||||
# Resolve `requires` chain for $1 in install-order (deps first).
|
||||
deps_for_app() {
    local app="$1"
    python3 -c "
import json
with open('$CATALOG') as f: c = json.load(f)
deps_map = {a['id']: list(a.get('requires', [])) for a in c['apps']}
visited, order = set(), []
def visit(x):
    if x in visited or x not in deps_map: return
    visited.add(x)
    for d in deps_map.get(x, []): visit(d)
    order.append(x)
for d in deps_map.get('$app', []): visit(d)
print('\n'.join(order))
"
}
|
||||
|
||||
# ── per-app transitions ──────────────────────────────────────────
|
||||
# Diff helper: capture container names matching a sane prefix for $app_id.
|
||||
# Approach: snapshot before install, snapshot after, take the difference =
|
||||
# this app's containers.
|
||||
snapshot_containers() {
|
||||
ssh_run "podman ps -a --format '{{.Names}}' | sort"
|
||||
}
|
||||
|
||||
# Whether $app currently has any of its expected containers present (in any
# state; the snapshot is `podman ps -a`). Uses the per-app metadata table in
# lib.sh (expected_containers_for) so variant apps (bitcoin-knots/bitcoin-core
# sharing slots) and stacks are detected correctly. Falls back to name-prefix
# match for apps the table doesn't know.
|
||||
app_already_installed() {
|
||||
local app="$1"
|
||||
local snap; snap=$(snapshot_containers)
|
||||
local expected
|
||||
expected=$(expected_containers_for "$app")
|
||||
local c
|
||||
for c in $expected; do
|
||||
echo "$snap" | grep -qxF "$c" && return 0
|
||||
done
|
||||
# Generic prefix fallback for apps not in the expected_containers_for table.
|
||||
echo "$snap" | grep -qE "^(${app}|${app}-|archy-${app}|archy-${app}-)"
|
||||
}
|
||||
|
||||
# Install missing deps for $app via the regular install path. Idempotent —
|
||||
# already-installed deps are skipped. Records dep_install per dep so we can
|
||||
# tell from the report whether the bitcoin pre-req was actually green by the
|
||||
# time lnd's matrix started.
|
||||
ensure_deps_installed() {
|
||||
local app="$1"
|
||||
local dep
|
||||
for dep in $(deps_for_app "$app"); do
|
||||
if app_already_installed "$dep"; then
|
||||
continue
|
||||
fi
|
||||
echo " · dep install: $dep (required by $app)"
|
||||
local img ver resp
|
||||
img=$(jq -r --arg id "$dep" '.apps[] | select(.id==$id) | .dockerImage // ""' "$CATALOG")
|
||||
ver=$(jq -r --arg id "$dep" '.apps[] | select(.id==$id) | .version // ""' "$CATALOG")
|
||||
if [ -z "$img" ]; then
|
||||
record "$app" "dep_$dep" FAIL "no dockerImage in catalog for dep $dep"
|
||||
return 1
|
||||
fi
|
||||
resp=$(rpc_call "package.install" "$(jq -nc \
|
||||
--arg id "$dep" --arg img "$img" --arg ver "$ver" \
|
||||
'{id:$id, dockerImage:$img, version:$ver}')")
|
||||
if echo "$resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" "dep_$dep" FAIL "rpc error: $(echo "$resp" | jq -c '.error')"
|
||||
return 1
|
||||
fi
|
||||
# Wait for at least one expected container to appear (presence only, not health).
|
||||
local deadline=$(($(date +%s) + 600))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
if app_already_installed "$dep"; then
|
||||
record "$app" "dep_$dep" PASS "installed"
|
||||
break
|
||||
fi
|
||||
sleep 5
|
||||
done
|
||||
if ! app_already_installed "$dep"; then
|
||||
record "$app" "dep_$dep" FAIL "containers did not appear within 10min"
|
||||
return 1
|
||||
fi
|
||||
done
|
||||
return 0
|
||||
}
|
||||
|
||||
# Pre-clean: if the app is currently installed, uninstall it and wait for
|
||||
# all containers to disappear. We can't measure install correctness without
|
||||
# starting from a clean slate. Fail-soft — if the uninstall RPC errors we
|
||||
# log but proceed; the install step will catch any residual state.
|
||||
preclean_app() {
|
||||
local app="$1"
|
||||
if ! app_already_installed "$app"; then
|
||||
return 0
|
||||
fi
|
||||
echo " · pre-clean: $app already installed, uninstalling first"
|
||||
local resp; resp=$(rpc_call "package.uninstall" "{\"id\":\"$app\"}")
|
||||
if echo "$resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
echo " pre-clean uninstall RPC error: $(echo "$resp" | jq -c '.error')"
|
||||
fi
|
||||
# Multi-container stacks (indeedhub: 7, immich: 5, mempool: 3, btcpay: 6)
|
||||
# take noticeably longer to tear down than single-container apps. 240s was
|
||||
# too tight for indeedhub's 7-container teardown — bump to 10 min for
|
||||
# safety; per-container timeout is still bounded inside archipelago itself.
|
||||
local deadline=$(($(date +%s) + 600))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
if ! app_already_installed "$app"; then return 0; fi
|
||||
sleep 5
|
||||
done
|
||||
echo " pre-clean: timeout waiting for $app to uninstall"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Run the full per-app matrix. Records a row per transition.
|
||||
run_app_matrix() {
|
||||
local app="$1"
|
||||
echo
|
||||
echo "═══ $app ═══"
|
||||
|
||||
if ! ensure_deps_installed "$app"; then
|
||||
record "$app" install FAIL "dep install failed; skipping rest of matrix"
|
||||
return
|
||||
fi
|
||||
preclean_app "$app" || record "$app" preclean FAIL "uninstall before test did not complete"
|
||||
|
||||
# ── 01 install ───────────────────────────────────────────────
|
||||
local before after new_containers
|
||||
before=$(snapshot_containers)
|
||||
# The install handler requires `id` + `dockerImage` from the catalog
|
||||
# entry. Match what the UI passes (Discover.vue / MarketplaceAppDetails.vue).
|
||||
local docker_image version
|
||||
docker_image=$(jq -r --arg id "$app" '.apps[] | select(.id==$id) | .dockerImage // ""' "$CATALOG")
|
||||
version=$(jq -r --arg id "$app" '.apps[] | select(.id==$id) | .version // ""' "$CATALOG")
|
||||
if [ -z "$docker_image" ]; then
|
||||
record "$app" install FAIL "no dockerImage in catalog for $app"
|
||||
return
|
||||
fi
|
||||
local install_resp
|
||||
install_resp=$(rpc_call "package.install" "$(jq -nc \
|
||||
--arg id "$app" --arg img "$docker_image" --arg ver "$version" \
|
||||
'{id:$id, dockerImage:$img, version:$ver}')")
|
||||
if echo "$install_resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" install FAIL "rpc error: $(echo "$install_resp" | jq -c '.error')"
|
||||
return # cannot continue this app
|
||||
fi
|
||||
|
||||
# Wait for the EXPECTED containers (per expected_containers_for) to all
|
||||
# appear. The old "snapshot stable for 10s + count > before" heuristic
|
||||
# terminated early on apps with deps: e.g. mempool's wait would break
|
||||
# when archy-electrs-ui (electrumx dep companion) appeared, long before
|
||||
# mempool's own containers were created (those take ~10min to pull and
|
||||
# start). Waiting on the expected-set is exact, not heuristic.
|
||||
#
|
||||
# Cap at 15 minutes — mempool stack with cold image cache needs ~12 min.
|
||||
local expected; expected=$(expected_containers_for "$app")
|
||||
local deadline=$(($(date +%s) + 900))
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
after=$(snapshot_containers)
|
||||
local missing=0 c
|
||||
for c in $expected; do
|
||||
echo "$after" | grep -qxF "$c" || missing=1
|
||||
done
|
||||
[ "$missing" -eq 0 ] && break
|
||||
sleep 5
|
||||
done
|
||||
new_containers=$(comm -13 <(echo "$before") <(echo "$after"))
|
||||
if [ -z "$new_containers" ]; then
|
||||
record "$app" install FAIL "no containers created within 15min"
|
||||
return
|
||||
fi
|
||||
# Assert each new container is in 'running' state.
|
||||
local install_ok=1; local detail=""
|
||||
while read -r c; do
|
||||
[ -z "$c" ] && continue
|
||||
local s
|
||||
s=$(probe_container_state "$c")
|
||||
if [ "$s" != "running" ]; then
|
||||
install_ok=0
|
||||
detail="$detail $c=$s"
|
||||
fi
|
||||
done <<< "$new_containers"
|
||||
if [ "$install_ok" -eq 1 ]; then
|
||||
record "$app" install PASS "$(echo "$new_containers" | tr '\n' ',' | sed 's/,$//')"
|
||||
else
|
||||
record "$app" install FAIL "containers not running:$detail"
|
||||
fi
|
||||
|
||||
# ── 02 ui_probe ──────────────────────────────────────────────
|
||||
local code
|
||||
code=$(probe_app_proxy "$app")
|
||||
# Accept all 2xx/3xx — proxy reaches backend, app may redirect to login,
|
||||
# serve OAuth flow (307), or use 308 permanent. 401/403 still fail because
|
||||
# those mean "backend reached, app rejected request" which is the
|
||||
# credential-plumbing failure mode we DO want to catch.
|
||||
if [[ "$code" =~ ^(2[0-9][0-9]|3[0-9][0-9])$ ]]; then
|
||||
record "$app" ui_probe PASS "HTTP $code"
|
||||
else
|
||||
record "$app" ui_probe FAIL "HTTP $code (expected 2xx/3xx)"
|
||||
fi
|
||||
|
||||
# ── 03 auth_probe (only for apps with a credentialed/data endpoint) ──
|
||||
local probe_code; local pass_codes
|
||||
if probe_code=$(auth_probe_for "$app" 2>/dev/null) && [ -n "$probe_code" ]; then
|
||||
pass_codes=$(auth_probe_pass_codes "$app")
|
||||
if echo " $pass_codes " | grep -qF " $probe_code "; then
|
||||
record "$app" auth_probe PASS "HTTP $probe_code"
|
||||
else
|
||||
record "$app" auth_probe FAIL "HTTP $probe_code (expected one of: $pass_codes — credential plumbing broken)"
|
||||
fi
|
||||
else
|
||||
record "$app" auth_probe SKIP "no authenticated probe defined"
|
||||
fi
|
||||
|
||||
# ── 04 stop ──────────────────────────────────────────────────
|
||||
local stop_resp
|
||||
stop_resp=$(rpc_call "package.stop" "$(jq -nc --arg id "$app" '{id:$id}')")
|
||||
if echo "$stop_resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" stop FAIL "rpc error: $(echo "$stop_resp" | jq -c '.error')"
|
||||
else
|
||||
local all_stopped=1
|
||||
while read -r c; do
|
||||
[ -z "$c" ] && continue
|
||||
wait_for_container_state "$c" "exited" 60 || all_stopped=0
|
||||
done <<< "$new_containers"
|
||||
if [ "$all_stopped" -eq 1 ]; then
|
||||
record "$app" stop PASS
|
||||
else
|
||||
record "$app" stop FAIL "not all containers reached exited state"
|
||||
fi
|
||||
fi
|
||||
|
||||
# ── 05 start ─────────────────────────────────────────────────
|
||||
local start_resp
|
||||
start_resp=$(rpc_call "package.start" "$(jq -nc --arg id "$app" '{id:$id}')")
|
||||
if echo "$start_resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" start FAIL "rpc error: $(echo "$start_resp" | jq -c '.error')"
|
||||
else
|
||||
local all_started=1
|
||||
while read -r c; do
|
||||
[ -z "$c" ] && continue
|
||||
wait_for_container_state "$c" "running" 90 || all_started=0
|
||||
done <<< "$new_containers"
|
||||
if [ "$all_started" -eq 1 ]; then
|
||||
record "$app" start PASS
|
||||
else
|
||||
record "$app" start FAIL "not all containers reached running state"
|
||||
fi
|
||||
fi
|
||||
|
||||
# ── 06 restart_container ─────────────────────────────────────
|
||||
# `package.restart` returns immediately and spawns the actual restart.
|
||||
# `podman restart -t <stop_timeout>` blocks for up to stop_timeout
|
||||
# seconds (e.g. 600s for bitcoin-core). Polling once after sleep 5
|
||||
# races on slow-stopping apps and false-positive-FAILs them. Poll
|
||||
# each container up to 90s for "running" instead.
|
||||
local restart_resp
|
||||
restart_resp=$(rpc_call "package.restart" "$(jq -nc --arg id "$app" '{id:$id}')")
|
||||
if echo "$restart_resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" restart FAIL "rpc error: $(echo "$restart_resp" | jq -c '.error')"
|
||||
else
|
||||
local all_running=1
|
||||
while read -r c; do
|
||||
[ -z "$c" ] && continue
|
||||
wait_for_container_state "$c" "running" 90 || all_running=0
|
||||
done <<< "$new_containers"
|
||||
if [ "$all_running" -eq 1 ]; then
|
||||
record "$app" restart PASS
|
||||
else
|
||||
record "$app" restart FAIL "container not running 90s after restart"
|
||||
fi
|
||||
fi
|
||||
|
||||
# ── 09 uninstall (skip 07 archipelago-restart and 08 host-reboot
|
||||
# here — those are batch tests run once across all installed apps) ─
|
||||
local uninst_resp
|
||||
uninst_resp=$(rpc_call "package.uninstall" "$(jq -nc --arg id "$app" '{id:$id}')")
|
||||
if echo "$uninst_resp" | jq -e '.error' >/dev/null 2>&1; then
|
||||
record "$app" uninstall FAIL "rpc error: $(echo "$uninst_resp" | jq -c '.error')"
|
||||
else
|
||||
# Wait for all this-app containers to be absent.
|
||||
local all_gone=1
|
||||
while read -r c; do
|
||||
[ -z "$c" ] && continue
|
||||
wait_for_container_state "$c" "absent" 120 || all_gone=0
|
||||
done <<< "$new_containers"
|
||||
if [ "$all_gone" -eq 1 ]; then
|
||||
record "$app" uninstall PASS
|
||||
else
|
||||
record "$app" uninstall FAIL "not all containers removed"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
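The expected-set wait in run_app_matrix checks each expected name against the snapshot with a whole-line, fixed-string grep. A minimal sketch of that check as a standalone helper follows. The function name `missing_expected_containers` and the `archy-mempool-*` container names are hypothetical and are not part of the harness. A helper like this could also name the containers still absent when the deadline expires.

```shell
# Hypothetical helper (an assumption, not part of the harness): print every
# expected container name that is absent from a snapshot, one per line.
#   $1 = snapshot, one container name per line
#   $2 = whitespace-separated expected names
missing_expected_containers() {
  local snapshot="$1" expected="$2" c
  for c in $expected; do
    # -x matches the whole line and -F treats the name as a fixed string,
    # so "archy-mempool" does not match "archy-mempool-db".
    printf '%s\n' "$snapshot" | grep -qxF -- "$c" || printf '%s\n' "$c"
  done
}

# Example with made-up container names (only archy-electrs-ui appears in
# the script above).
snapshot=$(printf '%s\n' archy-electrs-ui archy-mempool-db)
missing_expected_containers "$snapshot" "archy-mempool-db archy-mempool-web"
# prints: archy-mempool-web
```

Empty output means every expected container is present, which corresponds to the `missing=0` exit condition of the wait loop.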
|
||||
# ── batch transitions (run after per-app loop) ───────────────────
|
||||
batch_archipelago_service_restart() {
|
||||
echo
|
||||
echo "═══ batch: archipelago.service restart ═══"
|
||||
local before; before=$(snapshot_containers)
|
||||
if ! ssh_run 'sudo systemctl restart archipelago'; then
|
||||
record "_batch" archipelago_restart FAIL "systemctl restart errored"
|
||||
return
|
||||
fi
|
||||
ssh_wait_ready 60 || { record "_batch" archipelago_restart FAIL "ssh did not return"; return; }
|
||||
sleep 30 # let containers re-stabilize
|
||||
rpc_login || { record "_batch" archipelago_restart FAIL "rpc relogin failed"; return; }
|
||||
local after; after=$(snapshot_containers)
|
||||
if [ "$before" = "$after" ]; then
|
||||
record "_batch" archipelago_restart PASS "container set unchanged"
|
||||
else
|
||||
record "_batch" archipelago_restart FAIL "container set drifted across restart"
|
||||
fi
|
||||
}
|
||||
|
||||
batch_host_reboot() {
|
||||
echo
|
||||
echo "═══ batch: host reboot ═══"
|
||||
local before; before=$(snapshot_containers)
|
||||
ssh_run 'sudo systemctl reboot' || true # ssh disconnects immediately
|
||||
sleep 30
|
||||
# 5 min was too short — .228 took ~9min for full BIOS+kernel+systemd+
|
||||
# rootless-podman boot. 12 min gives margin for slower hardware.
|
||||
ssh_wait_ready 720 || { record "_batch" host_reboot FAIL "host did not come back in 12min"; return; }
|
||||
sleep 60 # let containers auto-restart
|
||||
rpc_login || { record "_batch" host_reboot FAIL "rpc unreachable after reboot"; return; }
|
||||
local after; after=$(snapshot_containers)
|
||||
if [ "$before" = "$after" ]; then
|
||||
record "_batch" host_reboot PASS "all containers came back"
|
||||
else
|
||||
local missing
|
||||
missing=$(comm -23 <(echo "$before") <(echo "$after") | tr '\n' ',' | sed 's/,$//')
|
||||
record "_batch" host_reboot FAIL "missing: $missing"
|
||||
fi
|
||||
}
|
||||
|
||||
# ── main ─────────────────────────────────────────────────────────
|
||||
APPS_LIST=$(apps_to_test)
|
||||
if [ -z "$APPS_LIST" ]; then
|
||||
echo "no apps match filter '$FILTER'" >&2; exit 2
|
||||
fi
|
||||
|
||||
while read -r app; do
|
||||
[ -z "$app" ] && continue
|
||||
run_app_matrix "$app"
|
||||
done <<< "$APPS_LIST"
|
||||
|
||||
# Batch transitions only run on full sweep (skip in filtered/smoke mode).
|
||||
if [ -z "$FILTER" ]; then
|
||||
batch_archipelago_service_restart
|
||||
batch_host_reboot
|
||||
fi
|
||||
|
||||
# ── summary ──────────────────────────────────────────────────────
|
||||
echo
|
||||
echo "═══ summary ═══"
|
||||
count_status() {
|
||||
local pat="$1"
|
||||
[ -s "$OUT_DIR/results.jsonl" ] || { echo 0; return; }
|
||||
awk -v pat="$pat" '$0 ~ pat { n++ } END { print n+0 }' "$OUT_DIR/results.jsonl"
|
||||
}
|
||||
PASS=$(count_status '"status":"PASS"')
|
||||
FAIL=$(count_status '"status":"FAIL"')
|
||||
SKIP=$(count_status '"status":"SKIP"')
|
||||
TOTAL=$((PASS + FAIL + SKIP))
|
||||
echo "PASS: $PASS / FAIL: $FAIL / SKIP: $SKIP / TOTAL: $TOTAL"
|
||||
echo "Report: $OUT_DIR/results.jsonl"
|
||||
|
||||
[ "$FAIL" -eq 0 ] || exit 1
|
||||
exit 0
|
||||
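The pre-clean and install waits above share one pattern: compute a deadline from `date +%s`, then poll a condition every few seconds until it holds or the deadline passes. A minimal sketch of that pattern as a reusable helper follows. `poll_until` is a hypothetical name and is not defined in the harness. Unlike the loops in the script, this version always runs the condition at least once, even with a timeout of 0.

```shell
# Hypothetical helper (an assumption, not part of the harness).
# Usage: poll_until TIMEOUT_SECS INTERVAL_SECS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once the deadline has passed.
poll_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Run the condition first so it is tried at least once.
    "$@" && return 0
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}

# Example: a condition that succeeds immediately.
if poll_until 2 1 true; then
  echo "ready"
else
  echo "timeout"
fi
```

With such a helper, the uninstall wait could in principle be written as `poll_until 600 5 app_is_gone "$app"`, where `app_is_gone` would be a hypothetical inverse of `app_already_installed`.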