Compare commits

69 Commits

Author SHA1 Message Date
archipelago
1f86f2e937 chore: release v1.7.47-alpha
Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx.

- Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification
  is parallelizable; the cap halved IBD speed on 4-8 core machines.
- Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool +
  connection buffers + I/O. 4g was OOM-prone during heavy IBD.
- Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB.
- bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125.
- bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0).

Future v2: host-RAM-aware dbcache scaling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:51 -04:00
archipelago
03b7966c38 chore: release v1.7.46-alpha
Follow-up to v1.7.45-alpha closing the remaining tasks identified by the
resilience sweeps + the new bitcoin orphan / install-fail-vanish bugs.

User-visible:
- Health monitor: stop paging on orphaned containers from variant switches
- Install fail: card stays visible (was vanishing) with error message
- Stack pull progress: interpolate 20→70% (was stuck at 20%)
- docker.io → lfg2025 mirror: bitcoin/gitea/nextcloud/valkey

Internal:
- Resilience harness — install-wait uses expected_containers_for, ui+auth
  probes retry with 60s backoff, dep-snapshot fix
- InstallProgress gains optional `message` field (frontend renders it
  when phase is None)

binary  $(stat -c %s releases/v1.7.46-alpha/archipelago)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago | awk '{print $1}')
tarball $(stat -c %s releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz | awk '{print $1}')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:33 -04:00
archipelago
68245142b5 release: v1.7.45-alpha artifacts and manifest
binary  41,618,344 bytes  sha256:ca1958b0f420cc6e73aa4bc161e20ebe7750e933888368394ad17a3f3a36cfad
tarball 77,025,110 bytes  sha256:59d538768e92a1cd726afd272838dbdd581c87780140792b2818434ef2ae7b81

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:43:22 -04:00
archipelago
dacdab9f6e chore: release v1.7.45-alpha
Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.

Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
  replaces fragile post-start exec that failed under restricted-cap rootless
  podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
  emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
  packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
  missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
  S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
  shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
  restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
  lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition

Infra:
- New resilience harness (scripts/resilience/) — black-box state-machine
  tester, every app × every transition. Run before each release.

Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:31:45 -04:00
archipelago
6c970dc969 chore: release v1.7.44-alpha
2026-04-28 15:03:04 -04:00
archipelago
43de3b73b2 feat(orchestrator): complete container migration and release hardening
2026-04-28 15:00:58 -04:00
archipelago
ce39430b33 feat(self-update): sync and rebuild UI containers on OTA
self-update.sh previously rebuilt only the backend binary and Vue
frontend. The custom UI containers (archy-bitcoin-ui, archy-lnd-ui,
archy-electrs-ui) were left untouched forever. That meant any change to
docker/<ui>/{Dockerfile, nginx.conf, index.html, ...} never reached a
running node through OTA; it required a manual SSH + rebuild. This is
exactly why the lnd-ui port fix didn't reach .228 in v1.7.43-alpha.

Add a sync-and-rebuild stage:

  1. Hash each docker/<ui>/ tree (content-only, path-stable via
     `cd && find` so src and dst compare equal when identical).
  2. rsync changed trees to /opt/archipelago/docker/<ui>/.
  3. For each changed UI: rebuild image as the archipelago user
     (rootless podman), then stop+remove+recreate the container using
     the canonical spec from scripts/container-specs.sh. Port mappings,
     caps, memory, and security opts all come from the spec, so the
     runtime can't drift from the tree.

Also install first-boot-containers.sh into /opt/archipelago/scripts/ so
a later reconciler run or reboot picks up current orchestration logic.

Idempotent: if no UI tree changed since the last update, the whole stage
is a no-op beyond the hash compare. Verified end-to-end on .228 with a
synthetic change to lnd-ui: detection, sync, build, recreate, and HTTP
200 on both the direct container port and the host-nginx /app/lnd/
proxy.
2026-04-23 15:48:53 -04:00
archipelago
72dec5aaa5 fix(lnd-ui): align container port across all specs
The LND UI container was unreachable on .228 after the v1.7.43-alpha
deploy because three sources of truth disagreed on which port nginx
listens on inside the container:

  - docker/lnd-ui/nginx.conf        listen 8081
  - docker/lnd-ui/Dockerfile        EXPOSE 8080
  - apps/lnd-ui/manifest.yml        host networking, ports: []
  - scripts/first-boot-containers.sh  -p 8081:8080
  - scripts/deploy-to-target.sh        -p 8081:80     (de-facto)
  - scripts/deploy-tailscale.sh        -p 8081:80
  - scripts/container-specs.sh        SPEC_PORTS=8081:80

Result: podman published host 8081 to container port 80, but no one was
listening on 80 inside, so connections were reset. Canonicalize on
container:80 with host:8081 publish, matching the three deploy paths
already in agreement.

Changes:
  - docker/lnd-ui/nginx.conf: listen 8081 -> listen 80
  - docker/lnd-ui/Dockerfile: EXPOSE 8080 -> EXPOSE 80
  - apps/lnd-ui/manifest.yml: replace host-network (never true) with
    bridge networking and explicit 8081:80 port mapping, correcting a
    documentation-vs-reality mismatch
  - scripts/first-boot-containers.sh: -p 8081:8080 -> -p 8081:80, and
    fix the internal-port comment

Verified on .228 after rebuild: curl http://127.0.0.1:8081/ returns HTTP
200 and the /app/lnd/ host-nginx proxy resolves cleanly.
2026-04-23 15:42:49 -04:00
archipelago
83aacdf209 chore(release): archive ISO build recipes, tarball-only releases
Releases no longer ship as bootable ISOs. Archipelago updates are
distributed as the backend binary plus a frontend tarball referenced by
releases/manifest.json. Nodes OTA-update via scripts/self-update.sh.

Filebrowser and AIUI remain bundled inside the frontend tarball and
deployed atomically, verified present in v1.7.43-alpha release artifact
(189 AIUI files, filebrowser-client bundle).

Archived under image-recipe/_archived/ (resurrectable if ISO distribution
is reintroduced):
  - build-auto-installer-iso.sh
  - build-unbundled-iso.sh
  - test-iso-qemu.sh
  - scripts/convert-iso-to-disk.sh
  - BUILD-ISO-STATUS.md, ISO-BUILD-CHECKLIST.md
  - branding/isohdpfx.bin
  - .gitea/workflows/build-iso-dev.yml

Updated release process docs to drop ISO references:
  - scripts/create-release.sh (next-steps text)
  - docs/BETA-RELEASE-CHECKLIST.md
  - docs/hotfix-process.md
  - README.md
2026-04-23 15:36:00 -04:00
archipelago
4ece2c1e7e release: v1.7.43-alpha artifacts and manifest
Backend binary, frontend tarball (with AIUI bundled), and updated
manifest.json pointing fleet updaters at the new download URLs.
2026-04-23 13:25:21 -04:00
archipelago
a672f45b00 docs(release-notes): v1.7.43-alpha bullet for AIUI preservation fix
2026-04-23 13:22:28 -04:00
archipelago
a76e7604a0 chore(release): bump version to 1.7.43-alpha
2026-04-23 13:21:58 -04:00
archipelago
84c2c2880a fix(aiui): bundle demo/aiui in self-update and ISO builds so updates never wipe it
Every OTA self-update and every ISO capture was implicitly relying on
/opt/archipelago/web-ui/aiui/ already being present on disk. Any node that
had its web-ui directory atomically swapped (for example by a manual
deployment shipping only neode-ui dist output) lost aiui entirely and the
AI Assistant tab fell through to the "needs to be enabled" placeholder.

self-update.sh: drop the rsync --exclude aiui preservation trick and
instead stage demo/aiui into the freshly-built dist tree before rsync.
demo/aiui in the repo is now the source of truth; every update overwrites
the on-disk copy with a matching version rather than carrying forward
whatever stale bundle happened to survive.

build-auto-installer-iso.sh: prepend demo/aiui to the AIUI search list so
ISO builds from a fresh repo clone pick it up automatically, without
requiring a side-checkout of the AIUI project or a live dev server.

This matches create-release-manifest.sh which already bakes demo/aiui
into the release tarball (lines 86-89).
2026-04-23 13:21:49 -04:00
archipelago
8034d382ee docs(release-notes): v1.7.43-alpha bullets for chunking, avatar, outbox, parser
Four production-code fixes merit user-visible mention: the transport
chunking data-corruption fix (real user-affecting bug for multi-chunk
mesh payloads), the avatar u16 overflow panic (backend crash on certain
seeds), the outbox TTL boundary, and the image-versions parser hardening.
2026-04-23 13:03:49 -04:00
archipelago
5ddc30db1e test: repair stale test fixtures across identity, mesh, update, wallet, fips
Several tests had drifted from the current production behavior:

- identity_manager: create() already auto-provisions a Nostr key, so the
  explicit create_nostr_key() call failed with "already exists". Rewrite
  the test to assert on record.nostr_npub from create() directly.
- mesh/protocol: test_build_app_start read the app name from frame[4..]
  but the v2 layout is [0:marker][1-2:len][3:cmd][4:version][5..:name].
  test_identity_broadcast_roundtrip expected input DID = output DID but
  the v2 decoder derives DID from the ed25519 pubkey, so the roundtrip
  compares against did_key_from_pubkey_hex(&pub) now.
- mesh/bitcoin_relay: test_build_block_header_announcement asserted
  sig.is_some(), but the builder intentionally emits an unsigned envelope
  to fit the 160-byte LoRa limit; assert sig.is_none(). Also widen
  placeholder hashes to the required 64 hex chars (32 bytes).
- update: load_mirrors() now merges default mirrors post-migration, so
  the roundtrip test must assert the custom mirror survives alongside
  the defaults rather than strict equality.
- wallet/cashu: test_proof_c_as_pubkey used hex that is not on the curve;
  replace with the secp256k1 generator point G so parsing succeeds.
- fips: test_status_reports_no_key_pre_onboarding asserted npub.is_none(),
  which fails on dev boxes where the fips daemon is already running. Keep
  the !key_present assertion and drop the npub one.
2026-04-23 13:02:45 -04:00
archipelago
de9995f869 test(credentials): seed identity/node_key in test helper so encrypt/decrypt works
Credentials tests created a fresh tempdir and immediately invoked
encrypt/decrypt, but load_encryption_key reads <dir>/identity/node_key
which did not exist, so every test failed with "node key not found".
Add a test_dir_with_node_key() helper that writes a deterministic 32-byte
key and switch all 8 call sites to it.
2026-04-23 13:02:28 -04:00
archipelago
83dac52410 fix(session): add test-only constructor so tests do not read real sessions
SessionStore::new() reads /var/lib/archipelago/sessions.json, which on
any node with an active dashboard contains live sessions that pollute
test state and cause intermittent failures. Introduce a cfg(test) only
new_for_tests(PathBuf) constructor and switch the test suite to it so
tests always start from a clean tempdir.
2026-04-23 13:02:22 -04:00
archipelago
5439aa8ff1 fix(container/image_versions): reject entries that are not image references
The parser retained any key ending in _IMAGE, so a harmless-looking
variable like NOT_AN_IMAGE="something" would be treated as a pinned
container image. Add a value-shape check: the value must contain both
a registry separator (/) and a tag separator (:) to qualify.
2026-04-23 13:02:15 -04:00
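The value-shape check described in 5439aa8ff1 might look roughly like the sketch below. The function name `is_pinned_image` and the example registry string are hypothetical; only the `_IMAGE` suffix rule and the requirement for both `/` and `:` come from the commit message.

```rust
/// Hypothetical helper: decides whether a KEY="value" line from
/// image-versions.sh should be kept as a pinned container image.
fn is_pinned_image(key: &str, value: &str) -> bool {
    // Old rule: any key ending in _IMAGE qualified.
    // New rule: the value must also look like an image reference,
    // i.e. contain a registry separator and a tag separator.
    key.ends_with("_IMAGE") && value.contains('/') && value.contains(':')
}
```

With this shape, `NOT_AN_IMAGE="something"` is rejected because the value has neither separator, while a key with the right suffix and a `registry/name:tag` style value is kept.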
archipelago
ebb5443309 fix(mesh/outbox): expire messages with zero TTL immediately
is_expired used age > ttl_secs, so a message with ttl_secs=0 whose age
rounded to 0 seconds was considered live forever. Switch to >= so the
zero-TTL boundary expires on the first check, matching the intuitive
meaning of TTL and the behavior the tests assert.
2026-04-23 13:02:07 -04:00
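The boundary change in ebb5443309 is small enough to show side by side. This is a minimal sketch that assumes age and TTL are both whole seconds held as `u64`; the real `is_expired` signature is not shown in the commit, and `is_expired_old` exists here only for comparison.

```rust
/// Pre-fix comparison: a zero-TTL message with age 0 never expires.
fn is_expired_old(age_secs: u64, ttl_secs: u64) -> bool {
    age_secs > ttl_secs
}

/// Post-fix comparison: the boundary case (age == ttl) counts as expired.
fn is_expired(age_secs: u64, ttl_secs: u64) -> bool {
    age_secs >= ttl_secs
}
```

For `ttl_secs = 0` and an age that rounds to 0, the old form returns false and the new form returns true.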
archipelago
a8862d4fe1 fix(avatar): prevent u16 overflow panic when seed byte is large
hue_color and accent_color computed (seed as u16) * 360, which overflows
u16 when seed >= 183 (182 * 360 = 65520 still fits in u16; 183 * 360 = 65880 does not); debug builds panicked, release wrapped silently.
Widen to u32 before the multiplication.

This also unblocks several identity_manager tests that constructed avatars
through master_node_svg and were aborting on the panic.
2026-04-23 13:02:01 -04:00
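The overflow fixed in a8862d4fe1 can be reproduced in isolation. The sketch below is not the project's `hue_color`: the function names and the division by 255 are assumptions added to make a complete example, and `checked_mul` is used so the overflow surfaces as `None` rather than a panic or a silent wrap.

```rust
/// Pre-fix shape: the multiplication happens in u16, so large seed
/// bytes push the product past u16::MAX (65535).
fn hue_narrow(seed: u8) -> Option<u16> {
    (seed as u16).checked_mul(360).map(|scaled| scaled / 255)
}

/// Post-fix shape: widen to u32 before multiplying. The final value is
/// at most 360, so narrowing back to u16 is safe.
fn hue_wide(seed: u8) -> u16 {
    ((seed as u32 * 360) / 255) as u16
}
```

`hue_narrow(255)` returns `None` because 255 * 360 does not fit in a u16, while `hue_wide(255)` returns 360.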
archipelago
6d2fba1307 fix(transport/chunking): stop overwriting first 4 bytes of user data
encode_chunked() split the payload into shards first, then overwrote
the first 4 bytes of shard 0 with a u32 length header, then re-ran
Reed-Solomon to regenerate parity over the now-corrupted shards. The
decoder correctly read the length header and trimmed `[4..4+len]`
from the reconstructed buffer, but those first 4 bytes had already
been destroyed on the encode side, so every chunked mesh payload
lost its first 4 bytes.

Restructure: reserve 4 bytes for the length header up front, build
a single contiguous [len][data][pad] buffer, then split into shards.
Parity is computed over the correct shards on the first pass, no
double-encode needed.

Update test_chunk_roundtrip_medium: 500 bytes + 4-byte header = 504
bytes, which is 5 data shards (ceil(504/124)), not 4. The old test
assertion was wrong all along and masked the corruption bug because
it only checked the roundtripped bytes, which is exactly what we
need to verify. New assertion is correct.

Verified: all 7 transport::chunking tests pass.
2026-04-23 12:29:10 -04:00
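The restructured framing from 6d2fba1307 (header first, then split) can be sketched without the Reed-Solomon step. The 124-byte shard size is taken from the ceil(504/124) arithmetic in the commit; the function names, big-endian header, and zero padding are assumptions, and parity generation is left out entirely.

```rust
/// Data shard size implied by the commit's ceil(504/124) example.
const SHARD_SIZE: usize = 124;

/// Build one contiguous [len][data][pad] buffer, then split it into shards.
/// The header is reserved up front, so no user byte is ever overwritten.
fn frame_into_shards(payload: &[u8]) -> Vec<Vec<u8>> {
    let mut buf = Vec::with_capacity(4 + payload.len());
    buf.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // assumed big-endian
    buf.extend_from_slice(payload);
    let shard_count = (buf.len() + SHARD_SIZE - 1) / SHARD_SIZE; // ceiling division
    buf.resize(shard_count * SHARD_SIZE, 0); // zero padding (assumption)
    buf.chunks(SHARD_SIZE).map(|chunk| chunk.to_vec()).collect()
}

/// Reassemble the shards, read the length header, and trim [4..4+len].
fn unframe(shards: &[Vec<u8>]) -> Option<Vec<u8>> {
    let buf: Vec<u8> = shards.concat();
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    buf.get(4..4 + len).map(|data| data.to_vec())
}
```

A 500-byte payload becomes a 504-byte framed buffer and therefore five data shards, matching the updated test expectation.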
archipelago
2270fc99ad docs(release-notes): v1.7.43-alpha bullet for install-log fix; prune stale RESUME note
2026-04-23 12:04:20 -04:00
archipelago
d15131d8a5 fix(install-log): pre-create /var/log/archipelago/ so non-root backend can write
The backend runs as `archipelago` and calls `install_log()` to append
audit lines to the install log on every install / update / remove /
start / stop / restart. Target path was /var/log/archipelago-container-installs.log,
which does not exist and cannot be created by the service because
/var/log/ is root-owned. OpenOptions errors were silently swallowed,
so the log was never written on any node.

Ship a tmpfiles.d rule that pre-creates /var/log/archipelago/ and
container-installs.log with archipelago:archipelago ownership. Move
the const path to match, keeping logs inside the directory logrotate
already rotates (image-recipe/configs/logrotate.conf). Install the
rule from both the ISO build and self-update, and apply it
immediately on self-update so existing nodes get a working log
without needing a reboot.

Verified on .228: file created, backend user can write, backend
binary rebuilt with new const.
2026-04-23 12:02:46 -04:00
archipelago
a1aacef974 docs(release-notes): v1.7.43-alpha bullet for self-update script refresh
Document that OTA updates now refresh the reconcile helper scripts,
closing the deploy gap that kept fixes to those scripts from
reaching existing nodes.
2026-04-23 11:51:04 -04:00
archipelago
6e2a45861e fix(self-update): install reconcile scripts on OTA updates
The OTA self-update path only refreshed image-versions.sh, leaving
reconcile-containers.sh and container-specs.sh frozen at whatever
version was baked into the ISO that originally provisioned the
node. Any fix to those scripts (notably the --create-missing flag
and the DISK_GB detection fix shipped this round) never reached
existing nodes, and on .228 both scripts were outright missing
because the node predated their inclusion in the ISO recipe.

Install all three helper scripts to /opt/archipelago/scripts/ on
every self-update run. Also preserve the legacy copy of
image-versions.sh at /opt/archipelago/image-versions.sh for any
older backend binaries still looking there first.
2026-04-23 10:07:53 -04:00
archipelago
8d5db4106e fix(update): pass --create-missing when rollback recreates a destroyed container
The update flow removes the old container before starting the new
one. If the update fails after removal, the rollback path tries
`podman start <name>` first, then falls back to reconcile. But
reconcile without --create-missing treats the now-absent container
as an optional one that the install flow will (re)create later,
and skips it. Result: container stays destroyed until someone
notices and runs reconcile manually.

Add --create-missing to the rollback reconcile invocation so the
fallback actually rebuilds the container from its canonical spec.

Fixes the failure mode observed on .228 where a bitcoin-knots
update left the node with no bitcoin-knots container at all.
2026-04-23 10:06:55 -04:00
archipelago
768ed47f45 docs(release-notes): v1.7.43-alpha bullets for disk-detection and rollback recovery
Add two user-facing release notes for fixes shipped this round:
- Full-archive Bitcoin nodes no longer silently get pruned on reconcile
  because the disk-size check was reading the OS partition.
- Failed updates can now recover via reconcile --create-missing instead
  of leaving a destroyed container behind.
2026-04-23 10:02:32 -04:00
archipelago
2a26576dbd fix(specs): measure DISK_GB at /var/lib/archipelago, not /
The reconcile spec for bitcoin-knots auto-enables prune=550 when
DISK_GB < 1000. DISK_GB was measured via `df /`, which on every
archy install reports the ~30 GB OS partition because user data
lives on a separate encrypted /var/lib/archipelago volume.

Result: every archy node with a 2 TB data drive was silently being
configured as a pruned node, and any bitcoin-knots container
recreated by reconcile would delete its historical blocks down to
the 550 MB prune window on next start.

Observed on .228 (2 TB box): blocks dir went from 384 GB to 926 MB
after a reconcile-triggered restart. Historical archive unrecoverable
without full re-IBD from genesis.

Fix: check /var/lib/archipelago first (where bitcoin data actually
lives). Fall back to / only on first-boot before the data partition
is mounted.
2026-04-23 09:54:16 -04:00
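The decision logic behind 2a26576dbd is sketched below. The real check lives in a shell spec and reads `df` output; this Rust version only models the two choices the commit describes, with hypothetical function names and a `data_mounted` flag standing in for the mount test.

```rust
const DATA_DIR: &str = "/var/lib/archipelago";

/// Which path to measure: the data volume when it is mounted, and the
/// root filesystem only as a first-boot fallback.
fn measure_path(data_mounted: bool) -> &'static str {
    if data_mounted { DATA_DIR } else { "/" }
}

/// Prune setting for bitcoin-knots: 550 when the measured disk is under
/// 1000 GB, otherwise no pruning (full archive).
fn prune_setting(disk_gb: u64) -> Option<u32> {
    if disk_gb < 1000 { Some(550) } else { None }
}
```

Measuring `/` on a node with a ~30 GB OS partition yields `Some(550)` regardless of the data drive, which is the failure the commit describes; measuring a roughly 2000 GB data volume yields `None`.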
archipelago
68d9bed601 feat(reconcile): add --create-missing flag for recovering from failed-update rollbacks
Context: when package update fails after remove-old-container but
before reconcile-recreate, the rollback path in update.rs tries to
restart the old container by name. If the container is already gone
(removed in step 3 of the update), rollback fails silently and the
node is left with no live container for that app but on-disk data
still intact. This is exactly the state .228 ended up in after the
reconcile-script-missing bug killed bitcoin-knots and lnd.

Reconcile was designed to only repair existing containers for
optional apps (SPEC_OPTIONAL=true): it skips "not installed" entries
on the assumption that the install RPC creates them. That safety
check is correct for normal operation but blocks recovery when an
optional-marked container has been destroyed by a failed update.

Fix: add --create-missing flag that overrides the SPEC_OPTIONAL skip.
When set, reconcile treats absent containers exactly the same as
broken containers — it creates them from the canonical spec using
the existing on-disk data directory. Narrow-scope override; the
default behaviour is unchanged.

Updated --help to document all four flags.

Verified on .228: after the failed bitcoin-core update took out both
bitcoin-knots and lnd, running reconcile --container=bitcoin-knots
--create-missing --force (as the archipelago user, not root —
podman is rootless) brought bitcoin-knots back using the pruned
chainstate at /var/lib/archipelago/bitcoin. Repeated for lnd. All
containers now running; electrumx reconnecting; UIs recovering.

Does NOT fix the underlying update-flow rollback hole (rollback
should be able to re-create a container from spec, not just restart
by name). That is a separate commit — this flag is the manual
recovery tool plus the primitive the improved rollback will call.
2026-04-23 09:42:19 -04:00
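The skip rule and its override from 68d9bed601 reduce to a small decision table. The reconcile script itself is shell; the Rust sketch below uses hypothetical names (`Action`, `plan`) and assumes that a missing non-optional container is always created, which the commit implies but does not state outright.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    /// Container exists: reconcile it against the canonical spec.
    Reconcile,
    /// Container is absent and is left for the install RPC to create.
    Skip,
    /// Container is absent and is created from the canonical spec.
    Create,
}

/// `optional` mirrors SPEC_OPTIONAL=true; `create_missing` mirrors the
/// --create-missing flag.
fn plan(exists: bool, optional: bool, create_missing: bool) -> Action {
    if exists {
        Action::Reconcile
    } else if optional && !create_missing {
        Action::Skip
    } else {
        Action::Create
    }
}
```

The default behaviour is unchanged: an absent optional container is still skipped unless the flag is passed.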
archipelago
fdc035dda7 docs: release-note image-versions fix, add marketplace QA tracker, update RESUME
- AccountInfoSection.vue: append 5th bullet to v1.7.43-alpha entry
  explaining that update-available badges and version comparisons
  work again now that the pinned-image catalog is found at the
  correct deployed path.

- docs/MARKETPLACE-QA.md: new tracker for the upcoming app-by-app
  install walk on .228. Documents the per-app fix workflow, the
  four layers we might need to fix at (app recipe, registry image,
  backend orchestrator, frontend), status-key table for tracking
  each catalog entry, and the release-notes policy for the walk.

- docs/RESUME.md: refresh with a9908597 commit, updated binary md5
  on .228, and split Immediate Next Step into Phase 1 (browser
  verification) and Phase 2 (marketplace walk) with a pointer to
  the new tracker.
2026-04-23 09:32:41 -04:00
archipelago
a990859745 fix(image-versions): locate image-versions.sh at its actual deployed path
The Rust search path listed /opt/archipelago/image-versions.sh and
scripts/image-versions.sh (repo-relative for dev), but the image
recipe deploys the file to /opt/archipelago/scripts/image-versions.sh.
Production nodes therefore silently failed every lookup: find_file
returned None, load_image_versions returned an empty HashMap, and
both pinned_image_for_app and pinned_images_for_stack returned no
matches.

Symptom on deployed nodes: every container scan emitted
"image-versions.sh not found in any search path" at DEBUG level, and
the version-comparison logic in docker_packages.rs plus the
update-check logic in api/rpc/package/update.rs silently degraded to
no-op — users would not see update-available badges and upgrade RPCs
could not resolve pinned targets.

Fix: put the canonical deployed path first in PATHS, keep the older
/opt/archipelago/image-versions.sh as a fallback for not-yet-updated
nodes, and retain scripts/image-versions.sh as the dev-repo-relative
fallback. Verified on .228: backend now logs "Parsed 57 image
versions from /opt/archipelago/scripts/image-versions.sh" on scan.

Pre-existing test_parse_image_versions failure in this module is
unrelated (the NOT_AN_IMAGE assertion was broken before this change
because the parser's _IMAGE-suffix retain keeps it). Leaving that for
the general cargo-test cleanup pass.
2026-04-23 09:29:15 -04:00
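The search order from a990859745 can be illustrated as a first-match lookup. The three paths and their order come from the commit; the injected `exists` predicate is an assumption made so the sketch runs without touching the filesystem, and the real `find_file` signature may differ.

```rust
/// Search order after the fix: canonical deployed path first, then the
/// legacy location, then the repo-relative dev path.
const PATHS: [&str; 3] = [
    "/opt/archipelago/scripts/image-versions.sh",
    "/opt/archipelago/image-versions.sh",
    "scripts/image-versions.sh",
];

/// Return the first candidate for which `exists` reports true.
fn find_file<F: Fn(&str) -> bool>(exists: F) -> Option<&'static str> {
    PATHS.iter().copied().find(|&path| exists(path))
}
```

In production the predicate would be something like `|p| std::path::Path::new(p).exists()`. When no candidate exists the function returns `None`, which is the case that previously left the version map empty.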
archipelago
013e8df077 docs(resume): add RESUME.md for context-restart recovery
Consolidated single-file snapshot of plan + progress for a fresh
OpenCode session to pick up the install UX polish work:

- Where we are: v1.7.43-alpha shipped, 5 commits on main, deployed
  to .228, browser verification in progress.
- Immediate next step: await user's verification results from
  https://192.168.1.228/ browser checklist.
- Working layout: SSHFS mount, ssh archy / archy228, deploy recipes.
- Architecture patterns: async-spawn lifecycle, phase-based install
  progress, scanner kick, .23 auto-purge migration.
- Backlog: Vaultwarden exit-on-start, install log perms, 22 stale
  cargo test failures, historical changelog entries left intact.
- User preferences: "best long-term first", one-by-one, no push,
  Bitcoin-only, conventional commits.

Complements STATUS.md (which remains the engineering log) with a
tighter resume-the-work narrative focused on the current round.
2026-04-23 09:14:36 -04:00
archipelago
f9fef8d2cc docs(status): record rounds 3-5 + config migration + changelog as shipped
Adds a new top section to STATUS.md covering v1.7.43-alpha:

- Round 3: phase-based install progress bar
- Round 4: post-install scanner kick for instant Launch button
- Round 5: .23 VPS retirement, .168 promoted to Server 1
- Config migration: auto-purge .23 from saved registry/mirror JSONs
- Changelog: new v1.7.43-alpha entry in AccountInfoSection

All 5 commits, deployment md5, verification notes, and git remote
cleanup captured. Round 2 rollback command still valid for the full
stack since backups predate every round in this session.
2026-04-23 09:09:02 -04:00
archipelago
008da4776d docs(changelog): add v1.7.43-alpha entry covering async lifecycle + .23 retirement
Four release-note bullets describing the user-visible changes shipped
in this round:

- async-spawn install/update/uninstall (UI no longer freezes)
- phase-based install progress bar (Preparing through Finalizing)
- scanner kick post-install (Launch button appears immediately)
- .23 Hetzner VPS retired, .168 OVH promoted to Server 1 with
  auto-purge migration for existing nodes

Matches the tone of existing changelog entries: what changed from the
operator's perspective, not internal implementation detail.
2026-04-23 09:07:29 -04:00
archipelago
0ee1682037 fix(config): auto-purge decommissioned .23 VPS from saved registry/mirror configs
load_registries + load_mirrors normally only ADD missing defaults to
the persisted JSON — explicit removals stick. After retiring the .23
Hetzner VPS we need the opposite: existing nodes have .23 baked into
their saved configs and would spend seconds per install/update timing
out against a dead host until the operator manually removes it via
the Settings UI.

Add a targeted one-time migration in both loaders: if any saved entry
has 23.182.128.160 in its URL, drop it on load and rewrite the file.
This is an exception to the usual "explicit removals stick" rule —
the user never chose to add this mirror, it was a default.

Narrow-scope migration (one hardcoded IP match, no schema version)
because the cost/benefit of a general migration system isn't worth
it for a single decommissioned host. Future retirements can follow
the same pattern.
2026-04-23 08:51:26 -04:00
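The one-time migration in 0ee1682037 amounts to a filter plus a "did anything change" signal. The sketch below models the saved config as a plain list of URL strings, which is a simplification; `purge_retired` is a hypothetical name, and the caller is assumed to rewrite the file when it returns true.

```rust
/// The decommissioned host named in the commit.
const RETIRED_HOST: &str = "23.182.128.160";

/// Drop every saved entry whose URL mentions the retired host.
/// Returns true when something was removed, so the caller knows the
/// persisted JSON needs rewriting.
fn purge_retired(urls: &mut Vec<String>) -> bool {
    let before = urls.len();
    urls.retain(|url| !url.contains(RETIRED_HOST));
    urls.len() != before
}
```

Running it a second time on the same list returns false, so the file is rewritten at most once per node.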
archipelago
2205232548 chore: retire .23 VPS mirror, promote .168 OVH to primary
The Hetzner VPS at 23.182.128.160 was decommissioned. Replace it
everywhere with the OVH VPS at 146.59.87.168, which was previously
the tertiary mirror.

  - update.rs: drop DEFAULT_TERTIARY_MIRROR_URL, promote .168 into
    the secondary slot as "Server 1 (OVH)"; tx1138 becomes Server 2.
    Default mirror list shrinks from 3 to 2.
  - container/registry.rs: default RegistryConfig drops .23, promotes
    .168 to Server 1 / priority 0, tx1138 stays Server 2 / priority 10.
  - api/rpc/package/config.rs: trusted-registry allowlist swaps .23
    for .168.
  - api/handler/mod.rs: app-catalog fallback URL uses .168.
  - neode-ui/views/marketplace/marketplaceData.ts: REGISTRY uses .168.
  - scripts/image-versions.sh: ARCHY_REGISTRY_FALLBACK uses .168.
  - image-recipe/build-auto-installer-iso.sh: installer ISO registries
    use .168 (both podman registries.conf and backend registries.json).

Tests updated to assert on the new 2-entry default lists (registry +
mirror). URL-parser fixture tests in update.rs retain .23 strings —
they exercise string-parsing logic, not mirror policy.

Git remotes: dropped `gitea-vps` and the .23 push URL on the `origin`
multi-push alias (not part of this commit — pure working-copy change).
2026-04-23 08:22:32 -04:00
archipelago
f86d86c354 fix(install): kick scanner post-install so Launch button appears immediately
After install completes, the async-spawn wrapper wrote state=Running
but the skeletal install-time manifest (interfaces: None) persisted
until the next scheduled 60s scan. The frontend saw state=running but
hasUI=false and hid the Launch button for up to a full minute.

Add a shared Notify/watch pair between RpcHandler and the scan loop:
  - scan_kick (Notify): scan loop selects! between the 60s interval
    and this notify, running immediately on either.
  - scan_tick (watch<u64>): scan loop bumps the counter after each
    completed scan so callers can await completion.

Install and update success paths now call kick_scanner_and_wait before
flipping to Running. The scan merges via merge_preserving_transitional
(state stays Installing/Updating, manifest refreshed from live podman
with interfaces.main.ui populated from real port bindings). 2s timeout
falls back to pre-fix behavior on slow podman — no regression.
2026-04-23 07:59:03 -04:00
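The kick-and-wait pattern from f86d86c354 is sketched below with standard-library primitives only. This is a swapped-in technique, not the commit's code: an `mpsc` channel with `recv_timeout` stands in for the tokio `Notify` plus interval select, and a `Mutex<u64>` with a `Condvar` stands in for the `watch<u64>` tick counter. `ScanTick` and `spawn_scan_loop` are hypothetical names, and the scan body is a placeholder.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Completed-scan counter that callers can wait on.
struct ScanTick {
    count: Mutex<u64>,
    changed: Condvar,
}

impl ScanTick {
    fn new() -> Self {
        ScanTick { count: Mutex::new(0), changed: Condvar::new() }
    }
}

/// Scan loop: runs on every interval expiry or on an explicit kick.
fn spawn_scan_loop(interval: Duration, tick: Arc<ScanTick>) -> Sender<()> {
    let (kick_tx, kick_rx) = mpsc::channel::<()>();
    thread::spawn(move || loop {
        match kick_rx.recv_timeout(interval) {
            Ok(()) | Err(RecvTimeoutError::Timeout) => {}
            Err(RecvTimeoutError::Disconnected) => break,
        }
        // Placeholder for the real container scan.
        let mut count = tick.count.lock().unwrap();
        *count += 1; // bump after the scan completes
        tick.changed.notify_all();
    });
    kick_tx
}

/// Kick the scanner and wait up to `timeout` for one completed scan.
/// Returns false on timeout so the caller can fall back.
fn kick_scanner_and_wait(kick: &Sender<()>, tick: &ScanTick, timeout: Duration) -> bool {
    let before = *tick.count.lock().unwrap();
    if kick.send(()).is_err() {
        return false;
    }
    let guard = tick.count.lock().unwrap();
    let (_guard, result) = tick
        .changed
        .wait_timeout_while(guard, timeout, |count| *count == before)
        .unwrap();
    !result.timed_out()
}
```

A caller on the install success path would invoke `kick_scanner_and_wait` with a 2-second timeout before writing the terminal state, and proceed either way when it returns false.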
archipelago
8cc84ebcb7 feat(install): phase-based progress bar replaces unparseable pull bytes
Podman emits zero parseable progress when stderr is piped (no TTY), so
the old byte-counter regex never matched in real installs. Users saw
0% for the whole pull, then a jump to 95%, then silence through
create-container, health-check, and post-install hooks.

Replace with 7 explicit lifecycle phases wired through install.rs and
update.rs: Preparing (5%), PullingImage (20%), CreatingContainer (70%),
StartingContainer (80%), WaitingHealthy (88%), PostInstall (95%),
Done (100%). Each maps to a fixed UI progress and status message.

Frontend PHASE_INFO mapper in stores/server.ts prioritizes phase when
present, falls back to byte-counter for legacy. A Math.max forward-only
guard ensures the bar never regresses. Deleted the duplicate watcher
in Discover.vue that was fighting the store's watcher with stale byte
logic. Added shimmer CSS on the fill (with prefers-reduced-motion
opt-out) so the bar looks alive during long phases.
2026-04-23 07:58:43 -04:00
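The phase table and forward-only guard from 8cc84ebcb7 fit in a few lines. The phase names and percentages are from the commit; the enum and function names are assumptions, and the guard is written in Rust here even though the commit places it in the TypeScript store as a `Math.max`.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum InstallPhase {
    Preparing,
    PullingImage,
    CreatingContainer,
    StartingContainer,
    WaitingHealthy,
    PostInstall,
    Done,
}

/// Fixed UI progress for each lifecycle phase.
fn phase_percent(phase: InstallPhase) -> u8 {
    match phase {
        InstallPhase::Preparing => 5,
        InstallPhase::PullingImage => 20,
        InstallPhase::CreatingContainer => 70,
        InstallPhase::StartingContainer => 80,
        InstallPhase::WaitingHealthy => 88,
        InstallPhase::PostInstall => 95,
        InstallPhase::Done => 100,
    }
}

/// Forward-only guard: a late or out-of-order phase event can never
/// move the bar backwards.
fn advance(shown: u8, phase: InstallPhase) -> u8 {
    shown.max(phase_percent(phase))
}
```

If the bar already shows 70 and a stale `PullingImage` event arrives, `advance` keeps it at 70.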
archipelago
b2cc7e09d6 docs(status): mark install/uninstall/update async-spawn as shipped
2026-04-23 06:58:45 -04:00
archipelago
e471ef754e fix(rpc): empty icon in transient install entry to avoid broken-image flicker
create_installing_entry hardcoded /assets/img/app-icons/<id>.png for
every new install. About half the app icons ship as .svg or .webp
(lnd.svg, vaultwarden.webp, bitcoin-knots.webp, mempool.webp), so the
browser 404s on the wrong extension and renders the default broken-image
glyph for the 10-30s window before the scanner refreshes with real
manifest data.

Send empty icon. The frontend's icon computed in AppCard.vue falls
through to curatedMap which has correct extensions for bundled apps,
and handleImageError still guards any remaining misses with a
placeholder SVG.
2026-04-23 06:58:12 -04:00
archipelago
0733ac4034 fix(ui): shorten install/uninstall/update timeouts for async RPCs
With the backend flipped to async-spawn, install/uninstall/update return
immediately with a { status, package_id } envelope. Client timeouts of
45m/11m were a leftover from synchronous handlers and masked real RPC
failures.

Drop all install/uninstall/update RPC timeouts to 15s. Progress and
terminal state still arrive through the live state stream — the RPC
only needs to confirm the spawn was accepted.

Return-type annotations updated in rpc-client.ts and stores/server.ts.
Five direct rpcClient.call sites across Marketplace.vue, Discover.vue,
and MarketplaceAppDetails.vue updated with the shorter timeout.
2026-04-23 06:58:02 -04:00
archipelago
2d5b859e18 feat(rpc): async-spawn install/uninstall/update lifecycle
Extend the async-spawn treatment previously shipped for Stop/Start/Restart
to the three remaining long-running lifecycle RPCs. Each wrapper validates
params, rejects duplicate in-flight ops, flips state to the transitional
variant (Installing/Removing/Updating), then spawns the existing inner
handler on tokio. RPC returns immediately with { status, package_id }; the
spawn task owns the terminal state write.

Install and update success arms explicitly set state=Running. The scan
loop merge (merge_preserving_transitional) refuses to overwrite
transitional states, so the spawn task must write the terminal state.
Uninstall's inner handler removes the entry entirely, so no explicit
terminal write is needed there.

Dispatcher and handler now thread self as Arc<Self> / &Arc<Self> so
spawned tasks can hold their own Arc without extra field cloning.

Transient install entry uses empty icon string. Hardcoding
/assets/img/app-icons/<id>.png 404s for apps that ship .svg or .webp
assets, which produces a broken-image flicker until the scanner refreshes
with manifest data. Empty string causes the frontend's icon computed to
fall through to the curated map, which has correct extensions.

Removed the inner "already updating" guard in update.rs — the wrapper
now owns duplicate-op detection for all three operations.
2026-04-23 06:57:50 -04:00
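The duplicate-op rejection that 2d5b859e18 moves into the wrapper can be sketched roughly as below. This is a minimal std-only illustration, not the project's code: `InFlightOps`, `try_begin`, and `finish` are hypothetical names, and the real wrapper may well detect duplicates by checking the package's transitional state rather than a separate set.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical registry of package ids that currently have a lifecycle op running.
#[derive(Default)]
pub struct InFlightOps {
    active: Mutex<HashSet<String>>,
}

impl InFlightOps {
    /// Claims the package for one operation; fails if another op already holds it.
    pub fn try_begin(&self, package_id: &str) -> Result<(), String> {
        let mut active = self.active.lock().unwrap();
        // HashSet::insert returns false when the id was already present.
        if active.insert(package_id.to_string()) {
            Ok(())
        } else {
            Err(format!("operation already in flight for {package_id}"))
        }
    }

    /// Releases the claim; the spawned task would call this after its terminal state write.
    pub fn finish(&self, package_id: &str) {
        self.active.lock().unwrap().remove(package_id);
    }
}
```

Keeping the check in one place is what allows the inner "already updating" guard to be dropped without losing protection.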
archipelago
4f279388a1 docs(status): mark async-spawn lifecycle fix as shipped
Records the four landed commits, the .228 deploy (binary + frontend
paths, backups, md5), the manual LND Stop verification, and the
rollback incantation. Leaves the older "NEXT SESSION" design block
in place as historical reference with a note that it's stale.

Adds a follow-ups list: chaos matrix is now unblocked, bundled-app
RPCs are still sync (deprecate or mirror-async?), transitional_since
is in-memory only, and there are 22 pre-existing test failures in
unrelated modules that should get their own cleanup pass.
2026-04-23 05:30:45 -04:00
archipelago
9ce28f080e fix(ui): single-button lifecycle control with transitional labels
The app card and details view previously used a pair of Start/Stop
buttons whose labels were driven off isAppLoading(), a client-side
"I just clicked the button" flag. When the backend's graceful stop
took longer than the RPC round-trip (up to 600s on bitcoin-core),
the flag cleared while the container was still shutting down, the
UI flipped back to "Running" as soon as the next 10s scan saw the
still-alive container, and the user had no indication the stop was
still in flight.

Now that the backend flips PackageState to Stopping / Starting /
Restarting / Installing / Updating / Removing for the duration of
each lifecycle operation and the scan loop preserves those states,
the UI can drive its label off the container state itself. A single
full-width primary button replaces the Start/Stop pair. Its label,
color, and disabled state come from getAppVisualState(), which
collapses resting states (exited/created/paused/installed) into
"stopped" and passes transitional states through untouched.

Changes:

- container-client.ts: widen ContainerStatus.state union to include
  the six transitional variants plus "installed". Add
  restartContainer() calling the new container-restart RPC.
- stores/container.ts: add getAppVisualState() computed and the
  restartContainer() action.
- ContainerApps.vue: single primary button (Start / Stop / Starting
  / Stopping / Restarting etc.) plus a separate circular Restart
  button visible only when running. Critically, handleStartApp and
  handleStopApp now route through store.startContainer and
  stopContainer (which call container-start / container-stop, the
  async RPCs) instead of the legacy synchronous bundled-app-start /
  bundled-app-stop path. Transitional-state polling widened from
  just "created" to the full set of transitional variants.
- ContainerAppDetails.vue: same single-button pattern, Restart
  button now calls container-restart instead of the old
  stop-sleep-start sequence, added 2s polling interval for
  transitional states.
- components/ContainerStatus.vue: widen state prop to match the
  shared union, render transitional labels with a trailing ellipsis
  and a yellow dot.

No new tests — this is presentation logic. Manual verification on
.228 will confirm the end-to-end async path: click Stop on LND,
button becomes "Stopping" in under a second, stays that way for
roughly 5 minutes, then flips to "Start" with a grey dot. The UI
must never revert to "Running" mid-stop.
2026-04-23 05:20:15 -04:00
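The collapsing rule described for getAppVisualState() in 9ce28f080e is small enough to show in a few lines. The real function lives in the TypeScript store; the sketch below restates the rule in Rust only to stay consistent with the backend examples, and `app_visual_state` is a hypothetical name.

```rust
/// Collapses resting container states into "stopped" and passes everything
/// else (running and the transitional variants) through unchanged.
pub fn app_visual_state(state: &str) -> &str {
    match state {
        "exited" | "created" | "paused" | "installed" => "stopped",
        other => other,
    }
}
```

Because transitional states pass through untouched, the button label can follow the backend state directly instead of a client-side "just clicked" flag.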
archipelago
6712810b92 fix(state): preserve transitional state across container scans
The 30s package scan loop used to blindly overwrite every package
entry from podman inspect. While a user-initiated Stop / Start /
Restart was in flight, the RPC spawn task would flip the state to
Stopping / Starting / Restarting, the next scan would see podman
still reporting "running" (for the duration of the graceful stop,
up to 600s for bitcoin-core), and clobber the transitional state
back to Running. The dashboard would then flip Running -> Stopping
-> Running -> Stopped, making it look like the stop had silently
failed until it eventually completed.

The merge loop now treats transitional variants (Stopping, Starting,
Restarting, Installing, Updating, Removing, and the three backup
variants) as owned by the RPC spawn task. For those variants,
merge_preserving_transitional keeps the existing state while still
taking live observability fields (health, exit_code, installed,
lan_address, manifest, static_files, available_update) from the
fresh scan so the UI continues to see live health readings.

Adds an escape hatch via a per-scan transitional_since side table:
if a package has been in a transitional state for more than 1200s
(2x the longest graceful stop at 600s on bitcoin-core), the scan
loop assumes the spawn task died without cleanup and overrides with
podman's live state. Prevents a crashed background task from wedging
a package in Stopping forever.

Three unit tests cover the merge rule, the observability passthrough,
and the transitional-variant classifier.
2026-04-23 05:15:13 -04:00
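The merge rule from 6712810b92 can be illustrated with a reduced model. This is a sketch under assumptions: the `Entry` struct carries only one owned field (`state`) and one observability field (`health`), the enum lists only the variants named in the commit (backup variants omitted), and the elapsed time is passed in rather than looked up in the transitional_since side table.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PackageState {
    Running,
    Stopped,
    Stopping,
    Starting,
    Restarting,
    Installing,
    Updating,
    Removing,
}

/// Escape hatch from the commit: 2x the longest graceful stop (600s).
pub const TRANSITIONAL_TIMEOUT: Duration = Duration::from_secs(1200);

/// Classifier for states owned by the RPC spawn task.
pub fn is_transitional(state: PackageState) -> bool {
    !matches!(state, PackageState::Running | PackageState::Stopped)
}

/// Hypothetical, reduced package entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub state: PackageState,
    pub health: Option<String>,
}

/// Keeps a transitional state unless it has been held longer than the timeout;
/// observability fields always come from the fresh scan.
pub fn merge_preserving_transitional(
    existing: &Entry,
    scanned: &Entry,
    in_transition_for: Duration,
) -> Entry {
    let keep_state =
        is_transitional(existing.state) && in_transition_for <= TRANSITIONAL_TIMEOUT;
    Entry {
        state: if keep_state { existing.state } else { scanned.state },
        health: scanned.health.clone(),
    }
}
```

With this shape, a scan that still sees "running" during a graceful stop cannot clobber Stopping, while a package stuck for more than 1200s falls back to whatever podman reports.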
archipelago
19a99ca993 fix(rpc): async container stop/start/restart; widen state mapping
RPC handlers no longer block on podman operations. container-stop on
bitcoin-core used to hold the connection for up to 600s while the UI
showed a frozen spinner; it now returns in under a second with
{status: stopping} after flipping the package state to Stopping and
broadcasting over WebSocket. Same treatment for container-start and
the new container-restart route.

Widens container-list state mapping to emit the transitional variants
(stopping, starting, restarting, installing, updating, removing,
installed, and the backup states) instead of collapsing them to
"unknown". Keeps the mapping in sync with the UI ContainerStatus.state
union so the dashboard can render the right transitional label.

Mirrors the treatment in package/runtime.rs for package.start,
package.stop, and package.restart. The body of each handler is lifted
into pure do_package_* helpers that the background task runs; state
flipping is bracketed around the spawn with revert on error. The
pre-existing post-start exit-check verification and restart stop+start
fallback run inside the spawned task, not the RPC body.

Adds container-restart route to the dispatcher. mark_user_stopped
continues to run BEFORE the spawn, preserving the ordering contract
with the crash recovery layer at runtime.rs:145-148.
2026-04-23 04:59:45 -04:00
archipelago
44cd5eefdf feat(rpc): spawn_transitional helper for async lifecycle ops
Introduces a new RPC-layer helper that bridges the synchronous
ContainerOrchestrator trait with RPC handlers that must return in <1s.

The helper flips the package state to a transitional variant
(Stopping / Starting / Restarting) in the StateManager so WebSocket
clients see the live label immediately, then tokio::spawns the
actual orchestrator call. On success it writes the final state; on
error it reverts to the pre-transition state and logs via
install_log().

The ContainerOrchestrator trait stays synchronous so the reconciler,
boot flow, unit tests, and chaos harness keep deterministic
behaviour. Async only lives in the RPC layer.

Not wired to any handler yet — Commit 2 consumes this helper.
Widens install_log visibility from pub(super) to
pub(in crate::api::rpc) so the new sibling module can reach it.
2026-04-23 04:55:52 -04:00
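The flip, spawn, then finalise-or-revert flow of spawn_transitional in 44cd5eefdf might look roughly like this. The sketch swaps tokio::spawn for std::thread so it runs without external crates, models the StateManager as a plain shared map, and replaces install_log() with eprintln!; the signature and the `State` enum are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Running,
    Stopped,
    Stopping,
}

/// Stand-in for the StateManager: package id -> current state.
pub type StateMap = Arc<Mutex<HashMap<String, State>>>;

/// Flips the package to `transitional` right away, then runs `op` in the background.
/// On success the final state is written; on error the previous state is restored.
pub fn spawn_transitional<F>(
    states: StateMap,
    package_id: &str,
    transitional: State,
    on_success: State,
    op: F,
) -> JoinHandle<()>
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let id = package_id.to_string();
    // The flip happens before returning, so observers see the live label immediately.
    let previous = states.lock().unwrap().insert(id.clone(), transitional);
    thread::spawn(move || {
        let result = op();
        let mut map = states.lock().unwrap();
        match (result, previous) {
            (Ok(()), _) => {
                map.insert(id, on_success);
            }
            (Err(err), Some(prev)) => {
                eprintln!("lifecycle op failed for {id}: {err}");
                map.insert(id, prev);
            }
            (Err(err), None) => {
                eprintln!("lifecycle op failed for {id}: {err}");
                map.remove(&id);
            }
        }
    })
}
```

The caller only waits for the state flip, which is why the RPC can return in well under a second while the orchestrator call itself stays synchronous.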
archipelago
f721ecf39b docs: STATUS.md — FUSE/SSHFS development loop section
Dedicated section covering the file-ops-via-mount + git/cargo-via-ssh
split that makes this dev setup work. Includes:

- Exact running mount command (pulled from ps)
- macFUSE + sshfs-mac brew install path
- Health check + recovery sequence for when mount hangs (it will)
- Full which-path-for-which-operation table
- Don't-do list (cargo from mount, rsync without AppleDouble exclude, etc)
- Cache caveat and inode-sharing note between mount and SSH views

No code change.
2026-04-23 04:51:53 -04:00
archipelago
120a307343 docs: STATUS.md — complete SSH/key/sudo/deploy reference for next session
Expands NEXT SESSION header with fully verified access info so a fresh
agent has zero ambiguity:

- SSH key inventory across laptop, .116, .228 (every file, purpose noted)
- Actual SSH config aliases (archy, archy228) with IdentitiesOnly
- Verified connectivity matrix (laptop -> both; .116 -> .228; .228 has no outbound key)
- Corrected sudo state: .228 sudoers file is /etc/sudoers.d/archipelago
  (not archipelago-ci); .116 has archipelago-ci + archipelago-wg scope-limited drop-ins
- SSHFS mount source command + AppleDouble gotcha
- Cargo over SSH PATH gotcha + detached build pattern for >2min timeout
- End-to-end deploy-to-.228 recipe (build, SCP, atomic swap, verify)
- Git workflow rules (no push, no amend, no force, conventional commits)

Removes duplicate host-reference block that the prior edit left trailing.
No code change.
2026-04-23 04:49:45 -04:00
archipelago
e557e0156f docs: STATUS.md — dashboard Stop UX bug diagnosis + async-spawn fix plan
Captures full design for the next session:
- Full bug sequence (5.5min blocking RPC + 30s scan clobbering transitional state)
- 4-commit implementation order with exact file:line targets
- Single-button UI spec with full label table
- Verification gates including manual LND stop test on .228
- Architectural decision: spawn lives in RPC layer, orchestrator trait stays sync

No code change yet; next session implements.
2026-04-23 04:45:12 -04:00
archipelago
1ab66f33a3 docs: STATUS.md — .228 dashboard bugs fixed (macaroon + ExtraHost) 2026-04-23 04:17:56 -04:00
archipelago
3ee192ba1f fix(first-boot): use podman host-gateway magic for host.containers.internal
The previous code computed HOST_GATEWAY from `ip route show default` to
work around an alleged podman 4.3.x limitation. Two problems:

1. The comment was wrong. Podman 4.4+ supports --add-host=host-gateway
   natively, and we ship 5.4.2.

2. More critically, `ip route show default` returns the LAN router
   (e.g. 192.168.1.254) — the gateway to the internet, not the gateway
   to the host. Every container configured with DAEMON_URL or
   --bitcoind.rpchost=host.containers.internal was therefore dialing
   the WiFi router instead of the host machine, silently failing.

Symptoms this caused on .228:
- LND crash-looped with "dial tcp 192.168.1.254:8332: connection refused"
- Dashboard showed no LND connect details or QR
- ElectrumX DAEMON_URL broken; stuck at 2 KB index for days
- Any service reaching bitcoin-core through the `archy-net` bridge

Replace the computed value with the literal string "host-gateway",
which podman translates to the correct in-network gateway at container
start. Also drop the stale HOST_GATEWAY reference in the Tor-bootstrap
branch (it always fell back to TARGET_IP anyway). Verified on .228:
after recreating bitcoin-core/electrumx/lnd with the new flag, LND
reached the chain backend, ElectrumX resumed indexing, and the
dashboard /lnd-connect-info endpoint succeeded.
2026-04-23 04:16:42 -04:00
archipelago
be96002372 fix(lnd): read admin macaroon via sudo fallback
LND's admin.macaroon is owned by a rootless-podman subordinate UID
(typically 100000) with mode 640. The archipelago server runs as UID
1000 and cannot read the file directly, which caused every dashboard
LND RPC (getinfo, connect-info, export-channel-backup) and lnd_client
to fail with "Failed to read LND admin macaroon".

Add a read_lnd_admin_macaroon() helper that first tries a direct read
(for operators who have relaxed permissions) then falls back to
`sudo -n cat`, mirroring the pattern already used for Tor hidden
service hostnames in handle_lnd_connect_info. Centralise the canonical
macaroon path as LND_ADMIN_MACAROON_PATH and route all four callers
through the helper.

Verified on .228: GET /lnd-connect-info now returns 200 with cert,
macaroon, and tor_onion fields. Dashboard QR/connect-string UI
unblocked.
2026-04-23 04:15:44 -04:00
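The direct-read-then-sudo pattern from be96002372 can be sketched as follows. The function name is hypothetical, and falling back only on a permission error is an assumption of this sketch; the commit says only that the helper tries a direct read first and then `sudo -n cat`.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Reads a file directly, falling back to `sudo -n cat <path>` when the
/// direct read is denied. `-n` makes sudo fail instead of prompting.
pub fn read_with_sudo_fallback(path: &Path) -> io::Result<Vec<u8>> {
    match fs::read(path) {
        Ok(bytes) => Ok(bytes),
        Err(err) if err.kind() == io::ErrorKind::PermissionDenied => {
            let output = Command::new("sudo").arg("-n").arg("cat").arg(path).output()?;
            if output.status.success() {
                Ok(output.stdout)
            } else {
                Err(io::Error::new(
                    io::ErrorKind::PermissionDenied,
                    format!("sudo -n cat failed for {}", path.display()),
                ))
            }
        }
        // Missing files and other errors are reported as-is without invoking sudo.
        Err(err) => Err(err),
    }
}
```

Routing every caller through one helper like this is what lets the canonical path live in a single constant.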
archipelago
4b8ef0a098 docs: STATUS.md through Step 9 (.228 hot-swap verified)
Logs Step 9 acceptance evidence, the two bugs caught and fixed during
the hot-swap (parse_memory_limit IEC suffix bug in 732df1b8 and
cgroup Delegate in ba83f9bc), and outlines the Step 10 plan for .116.
2026-04-23 03:46:23 -04:00
archipelago
ba83f9bce2 feat(systemd): delegate cgroup controllers to archipelago.service
Adds Delegate=memory pids cpu io to the archipelago.service unit.

Context: the service runs as User=archipelago under system.slice with
rootless podman. When podman creates transient libpod-*.scope units for
containers under user.slice, systemd needs the caller to hold
CAP_SYS_ADMIN on the target cgroup subtree, which happens iff
Delegate= lists the controllers we want to set. Without Delegate, any
future code path that goes through the podman CLI (runtime.rs) instead
of the libpod HTTP API (podman_client.rs) would hit MemoryMax
rejections that have exactly the same symptom as the bug I just fixed
in parse_memory_limit but with a completely different root cause.

Belt-and-braces: current production path uses PodmanClient and was
fixed in the preceding commit. But the DockerRuntime CLI path in
runtime.rs:262-268 (cmd.arg("--memory")) is still reachable via
AutoRuntime fallback on hosts without podman, and future rust
orchestrator code may legitimately need cgroup delegation. This
directive is a harmless no-op on hosts that already delegate upstream
(systemd gracefully handles duplicate/nested delegation).
2026-04-23 03:44:36 -04:00
archipelago
732df1b8cb fix: parse_memory_limit accepts Ki/Mi/Gi IEC binary suffixes
The libpod HTTP API path (PodmanClient::create_container) ran manifest
memory_limit values like "128Mi" through parse_memory_limit which
lowercased+trim_end_matches("m"), leaving "128i" which parse::<f64>()
rejected. The resulting None became 0 via .unwrap_or(0), and podman
serialised that into the OCI config as memory.limit:0. At container
start time systemd then rejected MemoryMax=0 with "Value specified in
MemoryMax is out of range".

Silently wrong for every manifest in apps/ that uses Kubernetes-style
suffixes (all of them). Became visible on .228 when Step 9 first
exercised the ProdContainerOrchestrator path for bitcoin-ui and lnd-ui
installs; the old first-boot-containers.sh bash script used podman
run --memory 128m directly, which podman-the-CLI parses correctly, so
the bug never surfaced before.

Two parts:
- parse_memory_limit now recognises Ki/Mi/Gi/Ti (IEC binary, what k8s
  and our manifests use), kB/MB/GB/TB (SI decimal), k/K/m/M/g/G/t/T
  (docker shorthand, treated as IEC binary for backwards compat), and
  bare byte integers. Filters out zero/negative results.
- create_container omits the memory/cpu fields entirely when the
  manifest has no limit or parsing fails, rather than emitting 0. The
  libpod API treats absent as unlimited; 0 is "set MemoryMax=0" which
  systemd rightly rejects. Defence in depth against the next weird
  suffix someone puts in a manifest.

Six regression tests in the new tests module cover IEC, SI, shorthand,
raw bytes, invalid input (empty/garbage/0/negative), and whitespace.
2026-04-23 03:44:23 -04:00
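A parser with the suffix rules listed in 732df1b8cb could be written along these lines. This is a sketch, not the project's function: the return type `Option<u64>` and the use of f64 for the numeric part are assumptions (the commit only mentions that the old code went through parse::<f64>()).

```rust
/// Parses a memory limit such as "128Mi", "1GB", "512m" or "4096" into bytes.
/// Returns None for empty, unparseable, zero, or negative input.
pub fn parse_memory_limit(input: &str) -> Option<u64> {
    const KIB: f64 = 1024.0;
    const SUFFIXES: [(&str, f64); 16] = [
        // IEC binary (Kubernetes-style)
        ("Ki", KIB),
        ("Mi", KIB * KIB),
        ("Gi", KIB * KIB * KIB),
        ("Ti", KIB * KIB * KIB * KIB),
        // SI decimal
        ("kB", 1e3),
        ("MB", 1e6),
        ("GB", 1e9),
        ("TB", 1e12),
        // docker shorthand, treated as binary
        ("k", KIB),
        ("K", KIB),
        ("m", KIB * KIB),
        ("M", KIB * KIB),
        ("g", KIB * KIB * KIB),
        ("G", KIB * KIB * KIB),
        ("t", KIB * KIB * KIB * KIB),
        ("T", KIB * KIB * KIB * KIB),
    ];

    let trimmed = input.trim();
    // Split off the first matching suffix; no suffix means bare bytes.
    let (number, multiplier) = SUFFIXES
        .iter()
        .find_map(|(suffix, mult)| trimmed.strip_suffix(*suffix).map(|n| (n, *mult)))
        .unwrap_or((trimmed, 1.0));

    let value: f64 = number.trim().parse().ok()?;
    let bytes = value * multiplier;
    // Reject zero, negative, NaN and infinite results.
    if bytes.is_finite() && bytes >= 1.0 {
        Some(bytes as u64)
    } else {
        None
    }
}
```

A caller following the second part of the fix would skip the memory field entirely on `None` instead of substituting 0.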
archipelago
a0707f4d48 feat(iso): Step 8a — retire archipelago-reconcile systemd timer
BootReconciler (in-process, 30s interval, spawned from main.rs as of
Step 6 commit 48f08aa3) fully replaces the timer-driven bash
reconciliation path. Delete the systemd unit + timer and their
ISO-builder touchpoints.

Removed:
- image-recipe/configs/archipelago-reconcile.service
- image-recipe/configs/archipelago-reconcile.timer
- image-recipe/build-auto-installer-iso.sh L412-413 (COPY unit+timer)
- image-recipe/build-auto-installer-iso.sh L449 (systemctl enable)
- image-recipe/build-auto-installer-iso.sh L542-543 (cp to WORK_DIR)

Kept (intentionally):
- scripts/reconcile-containers.sh
- scripts/container-specs.sh

Reason: core/archipelago/src/api/rpc/package/update.rs still invokes
reconcile-containers.sh at two sites (OTA update + rollback paths).
Porting those call sites to ContainerOrchestrator::upgrade() requires
manifests for every container update.rs might touch — that scope
belongs in Step 8b. Until then the script stays on disk, just no
longer runs on a periodic timer.

No Rust code changes. cargo check -p archipelago clean, 6 pre-existing
warnings. Skipped full ISO rebuild validation per user decision —
edits are 5 textual deletions with zero behavioral ambiguity; Step 9
live hot-swap on .228 will catch any regression.
2026-04-23 03:04:58 -04:00
archipelago
1c81a739d6 docs: split Step 8 into 8a/8b/8c
Discovered during Step 8 execution that first-boot-containers.sh
creates 30+ containers with per-container logic (wallet loads, DB
init, rpcauth derivations, post-create health waits) and does
substantial non-container setup (secret gen, rootless-podman subuid
chowns, Tor hostnames, WireGuard, firewall, nostr-relay). Only 3 of
the 30+ containers have manifests today (the UIs from Step 7).

Deleting the bash in a single step bricks first-boot on fresh
installs. Split into:

- 8a: delete reconcile-containers.sh + container-specs.sh + reconcile
  systemd unit + timer. BootReconciler fully covers these. Safe,
  atomic, no manifest porting required.
- 8b: port remaining ~25 containers into apps/<id>/manifest.yml. One
  manifest per commit, validated against current bash behavior.
  Multi-day scope.
- 8c: rename first-boot-containers.sh -> first-boot-setup.sh, strip
  container ops, keep secret/dir/Tor/WG/firewall setup. Final
  one-way door, requires 8b complete.
2026-04-23 02:34:43 -04:00
archipelago
6e46932f72 docs: STATUS.md through Step 7 2026-04-23 02:21:01 -04:00
archipelago
069bc4a561 feat(container): bitcoin-ui pre-start hook renders nginx.conf from embedded template
Replaces the first-boot-containers.sh sed/envsubst approach with a
Rust-native render step bound into the ContainerOrchestrator lifecycle.

- New container::bitcoin_ui module: embeds the nginx.conf template via
  include_str!, reads the plaintext RPC password from
  /var/lib/archipelago/secrets/bitcoin-rpc-password, substitutes
  {{BITCOIN_RPC_AUTH}} with base64(archipelago:<password>), and atomic-
  writes (tmp + rename) to /var/lib/archipelago/bitcoin-ui/nginx.conf.
  Idempotent: byte-compares before writing so unchanged input is a
  no-op (no inode churn, no restart cascade).
- ProdContainerOrchestrator gains run_pre_start_hooks(app_id) returning
  HookOutcome::{Rewritten, Unchanged}. Fires in install_fresh before
  create_container, and in ensure_running: on Running + Rewritten
  triggers a restart; on Stopped re-renders then starts.
- bitcoin-ui Dockerfile no longer COPYs a default.conf; the file now
  arrives via runtime bind-mount of the rendered config. If the bind-
  mount is ever missing, nginx starts with no site configured and
  returns 404 everywhere — safe failure vs. serving upstream RPC with
  a stale Authorization header.
- apps/{bitcoin,electrs,lnd}-ui/manifest.yml land as first-class
  manifests. bitcoin-ui declares the bind-mount target and a dependency
  on bitcoin-core; electrs-ui and lnd-ui declare their own deps and
  health checks.
- 8 new unit tests on the render fn (idempotency, rotation, trimming,
  missing/empty secret, template invariants) plus an integration test
  asserting install(bitcoin-ui) actually lands a substituted nginx.conf
  on disk via the hook. 39/39 container:: tests pass
  (test_parse_image_versions pre-existing failure unchanged, out of
  scope).
2026-04-23 02:19:52 -04:00
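The idempotent render-and-write step from 069bc4a561 might be structured like this. It is a sketch under assumptions: `render_template` and `write_if_changed` are hypothetical names, the ".tmp" sibling path is an illustrative choice, and the base64 encoding of the credentials is assumed to happen in the caller because the standard library has no base64 encoder.

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum HookOutcome {
    Rewritten,
    Unchanged,
}

/// Substitutes the placeholder named in the commit with an already-encoded value.
pub fn render_template(template: &str, auth_value: &str) -> String {
    template.replace("{{BITCOIN_RPC_AUTH}}", auth_value)
}

/// Writes `contents` to `target` via tmp + rename, but only if the bytes differ.
pub fn write_if_changed(target: &Path, contents: &[u8]) -> io::Result<HookOutcome> {
    // Byte-compare first so unchanged input causes no inode churn.
    if let Ok(existing) = fs::read(target) {
        if existing.as_slice() == contents {
            return Ok(HookOutcome::Unchanged);
        }
    }
    // Write to a sibling temp file, then rename over the target.
    let mut tmp = target.as_os_str().to_owned();
    tmp.push(".tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, target)?;
    Ok(HookOutcome::Rewritten)
}
```

Returning the outcome lets the orchestrator restart a running container only when the rendered config actually changed.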
archipelago
ca734e4ea6 docs: STATUS.md through Step 6 2026-04-22 19:20:17 -04:00
archipelago
48f08aa3e4 feat(container): wire ProdContainerOrchestrator + BootReconciler into main
Step 6 of the rust-orchestrator migration. Construct the container
orchestrator once in main.rs, call load_manifests + adopt_existing
immediately after Config::load, log the adoption report, and spawn
BootReconciler::run_forever with the 30s default interval. Thread the
orchestrator through Server::new -> ApiHandler::new -> RpcHandler::new
so the reconciler and RPC layer share one instance.

Wire a tokio::sync::Notify through the SIGTERM/SIGINT shutdown path so
the reconciler exits cleanly alongside the server drain. Uses notify_one
so the signal stores a permit if the reconciler is mid reconcile_all
when the signal fires.

Delete the commented-out run_boot_reconciliation block in main.rs that
documented the prior bash-script approach being unsafe on unbundled
installs — the new reconciler is manifest-driven and only touches apps
present in /opt/archipelago/apps, fixing that concern.

cargo check -p archipelago clean (6 pre-existing dead-code warnings on
trait methods not yet exercised until Step 9 hot-swap). Container test
suite 43/44 pass; the one failure (container::image_versions::
test_parse_image_versions) is pre-existing and unrelated.
2026-04-22 19:20:13 -04:00
archipelago
fc39b04b4e feat(container): BootReconciler — periodic reconcile loop for prod orchestrator
Step 5 of the rust-orchestrator migration. New file boot_reconciler.rs holds a
small Tokio task that calls ProdContainerOrchestrator::reconcile_all() on a
30-second cadence (answered design Q3).

  * BootReconciler::new(orch, interval, shutdown) — shutdown is an Arc<Notify>
    so callers can trigger a graceful exit without pulling in tokio-util.
  * run_forever(self) — does one reconcile immediately, then loops on
    tokio::select! { sleep_until | shutdown.notified() }. Shutdown interrupts
    the sleep but never an in-flight reconcile_all call.
  * Per-pass outcomes are logged at debug/warn; failures never propagate out
    because reconcile_all already absorbs per-app errors into ReconcileReport.

Four tokio::test(start_paused = true) tests verify the loop cadence against a
CountingRuntime test double:
  * initial_pass_fires_immediately — first reconcile runs with no delay
  * second_pass_fires_after_interval — second pass fires after exactly
    interval elapses in paused-clock time
  * shutdown_terminates_loop — notify_one() lets run_forever return
  * failure_in_one_pass_does_not_stop_loop — the loop keeps ticking even when
    the first pass had to install a missing container

Not wired into main.rs yet — that is Step 6. Re-exported from container::mod
as BootReconciler + RECONCILER_DEFAULT_INTERVAL for the wire-up step.
2026-04-22 19:04:34 -04:00
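The loop shape described for run_forever in fc39b04b4e (one pass immediately, then sleep-or-shutdown) can be approximated without tokio. The sketch below deliberately swaps tokio::select! and Notify for a std mpsc channel with recv_timeout; `run_until_shutdown` and the returned pass count are hypothetical additions for illustration.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Runs `reconcile` once immediately, then once per `interval`, until a
/// shutdown message arrives or the sender is dropped. Returns the pass count.
pub fn run_until_shutdown<F: FnMut()>(
    interval: Duration,
    shutdown: Receiver<()>,
    mut reconcile: F,
) -> u32 {
    let mut passes = 0;
    loop {
        // A pass is never interrupted; shutdown is only observed between passes.
        reconcile();
        passes += 1;
        match shutdown.recv_timeout(interval) {
            Err(RecvTimeoutError::Timeout) => continue,
            Ok(()) | Err(RecvTimeoutError::Disconnected) => return passes,
        }
    }
}
```

A queued channel message plays the role of the stored notify_one permit: a shutdown sent while a pass is running is still seen as soon as the pass finishes.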
archipelago
d7692790bc docs: update STATUS.md — Step 4 done, Step 5 next
Records acceptance evidence for Steps 1-4 (container tests 21/21 pass, build
clean with expected unused-method warnings) and queues the BootReconciler
implementation for Step 5.
2026-04-22 18:57:43 -04:00
archipelago
138588422a chore: gitignore macOS AppleDouble files from SSHFS writes
The laptop mounts ~/Projects/archy over SSHFS and macOS finder / Spotlight
sidecars write ._<name> resource-fork files alongside every edit. They are
noise; keep them out of git.
2026-04-22 18:56:58 -04:00
archipelago
e8a59c93c6 feat(container): ContainerOrchestrator trait, RpcHandler uses it in prod
Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
surface behind a single trait so the RPC layer stops caring whether it is
talking to the dev or prod orchestrator.

  * New trait core/archipelago/src/container/traits.rs: ContainerOrchestrator
    with install / start / stop / restart / remove / upgrade / status / list /
    logs / health, all keyed by app_id. Every method is async_trait-based.

  * ProdContainerOrchestrator: the lifecycle methods are moved from inherent
    impl into the trait impl (avoids name-shadowing recursion). Adoption and
    reconcile remain inherent since only main.rs / BootReconciler call them.

  * DevContainerOrchestrator: new trait impl that forwards to the existing
    Dev-named methods, applying the dev container-name + port-offset rules
    internally. New load_manifest_for() helper resolves app_id to
    <data_dir>/apps/<app_id>/manifest.yml so trait-level install(app_id)
    works in dev too. install_container(manifest, path) stays inherent for
    the manifest-path RPC shape.

  * RpcHandler now holds Option<Arc<dyn ContainerOrchestrator>> and, when in
    dev mode, a separate Option<Arc<DevContainerOrchestrator>> for the
    manifest_path install RPC. In prod mode RpcHandler::new() constructs a
    ProdContainerOrchestrator and calls load_manifests() at startup.

  * All seven container-* RPC guards no longer say dev mode required.
    container-install still requires dev mode because its manifest_path
    argument has no prod meaning; every other container RPC now works in both
    modes via the trait.

BOOT STILL DOES NOT USE THIS. main.rs wire-up (Step 6) and BootReconciler
(Step 5) come next. Until then the prod orchestrator is constructed but nothing
populates /opt/archipelago/apps so it has zero manifests to manage, matching
the pre-Step-4 behaviour.

Verification: cargo build -p archipelago clean (11 expected unused method
warnings for methods not yet wired from main.rs). cargo test -p archipelago:
all 21 container::* tests pass (16 prod_orchestrator + 5 others). 24 other
test failures are pre-existing and unrelated (identity_manager / session /
wallet / mesh / credentials — all independently flaky on file-backed state).
2026-04-22 18:56:52 -04:00
archipelago
b6a04d315a feat(container): ProdContainerOrchestrator with build-or-pull, adoption, reconcile
Step 3 of the rust-orchestrator-migration. New file prod_orchestrator.rs (999 LOC)
implements the full public surface that will replace scripts/first-boot-containers.sh:

  * install / start / stop / restart / remove / upgrade / status / list / logs / health
  * adopt_existing: read-only scan that claims containers matching our manifests by
    name, without recreating — preserves the v1.7.42 fixture on .116.
  * reconcile_all: level-triggered, per-app failures collected rather than aborting.
  * install_fresh: build-or-pull (Step 2 trait methods), relative build contexts
    resolved against the manifest directory.

Naming rule (answered design Q1): UI app IDs (bitcoin-ui/electrs-ui/lnd-ui) get the
archy- prefix; backends keep their bare ID. An explicit extensions.container_name
always wins. Codified in compute_container_name() with unit tests for all three tiers.

Concurrency (answered design Q4): per-app tokio::sync::Mutex<()> created lazily,
protecting every mutating op against the reconciler loop. Acquiring the per-app
lock only needs a read lock on the map, so independent apps do not serialize.

16 tests: 3 sync naming rule tests + 13 tokio async tests covering install (pull,
build-absent, build-present, relative-context), reconcile (noop/exited/missing/
mixed-failure), adopt-by-name, upgrade sequence ordering, list filtering, health
state mapping, and unknown-app-id rejection. All pass.

Not wired into main.rs yet — that is Step 6. Crate builds clean with expected
unused warnings for the new re-exports.
2026-04-22 18:32:31 -04:00
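The three-tier naming rule in b6a04d315a can be sketched as below. Detecting UI apps by the "-ui" suffix is an assumption of this sketch (the commit lists only the three IDs), and the signature of compute_container_name here is illustrative rather than the project's.

```rust
/// Tier 1: an explicit container name always wins.
/// Tier 2: UI app IDs get the "archy-" prefix.
/// Tier 3: backends keep their bare ID.
pub fn compute_container_name(app_id: &str, explicit: Option<&str>) -> String {
    if let Some(name) = explicit {
        return name.to_string();
    }
    if app_id.ends_with("-ui") {
        format!("archy-{app_id}")
    } else {
        app_id.to_string()
    }
}
```

Keeping the rule in one pure function is what makes adopt-by-name testable without a container runtime.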
archipelago
34af4d9d4e feat(container): runtime trait gains image_exists + build_image
Adds two methods to ContainerRuntime so the upcoming ProdContainerOrchestrator
can inspect local image storage and build images from BuildConfig:

- image_exists(image_ref) -> Result<bool>: local-storage check only, does
  not consult registries. Distinguishes exit 0 (present) from exit 1
  (absent) from other failures (environment error).
- build_image(&BuildConfig) -> Result<()>: shells out to podman/docker
  build with -t, -f, deterministically-sorted --build-arg pairs, and the
  context path last.

Implemented on all three runtimes:
- PodmanRuntime: new podman_cli helper shells out alongside the existing
  HTTP API calls (build and image inspect are awkward over the HTTP API)
- DockerRuntime: native docker CLI, same exit-code semantics
- AutoRuntime: delegates to the selected inner runtime

Argv construction extracted into pure build_args_for_podman helper so it
can be unit-tested without a real podman. 4 new tests cover minimal args,
custom Dockerfile path, deterministic build-arg sorting (guards against
HashMap iteration non-determinism), and context-is-last (positional arg
placement is load-bearing for podman build).

Step 2 of docs/rust-orchestrator-migration.md. 25/25 tests pass.
2026-04-22 17:46:47 -04:00
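The argv construction described in 34af4d9d4e (tag, Dockerfile, sorted build args, context last) can be illustrated like this. The `BuildConfig` field names are assumptions for the sketch; only the flag order comes from the commit.

```rust
use std::collections::HashMap;

/// Hypothetical shape of the build configuration.
pub struct BuildConfig {
    pub tag: String,
    pub dockerfile: Option<String>,
    pub build_args: HashMap<String, String>,
    pub context: String,
}

/// Builds the argument list for `podman build` without running anything.
pub fn build_args_for_podman(cfg: &BuildConfig) -> Vec<String> {
    let mut argv = vec!["build".to_string(), "-t".to_string(), cfg.tag.clone()];
    if let Some(dockerfile) = &cfg.dockerfile {
        argv.push("-f".to_string());
        argv.push(dockerfile.clone());
    }
    // Sort the pairs so HashMap iteration order cannot change the command line.
    let mut pairs: Vec<(&String, &String)> = cfg.build_args.iter().collect();
    pairs.sort();
    for (key, value) in pairs {
        argv.push("--build-arg".to_string());
        argv.push(format!("{key}={value}"));
    }
    // The build context is positional and must come last.
    argv.push(cfg.context.clone());
    argv
}
```

Because the function is pure, its output can be compared against a fixed vector in unit tests, with no podman binary involved.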
archipelago
3767c2670c feat(container): add build source to manifest schema
ContainerConfig.image is now Option<String>, mutually exclusive with a new
optional ContainerConfig.build: Option<BuildConfig>. Exactly one of image
or build must be present, enforced in AppManifest::validate.

Adds ResolvedSource enum (Pull | Build) and ContainerConfig::resolve +
::image_ref helpers so the orchestrator can treat pull and build uniformly.
All 26 existing pull-only manifests continue to parse unchanged
(covered by existing_pull_only_manifests_still_parse test).

Call sites updated: podman_client, runtime::DockerRuntime, dev_orchestrator.
Dev orchestrator errors out cleanly on Build sources until Step 2 lands
build_image support on the runtime trait.

Step 1 of docs/rust-orchestrator-migration.md. 10 new unit tests, all pass.

Also includes: docs/rust-orchestrator-migration.md (design spec) and
docs/STATUS.md resume section for the next session.
2026-04-22 17:46:36 -04:00
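The image-or-build exclusivity introduced in 3767c2670c could be expressed roughly as follows. The struct fields mirror the names in the commit, but the `BuildConfig` contents, the borrowed `ResolvedSource` variants, and the error strings are assumptions of this sketch.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BuildConfig {
    pub context: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ContainerConfig {
    pub image: Option<String>,
    pub build: Option<BuildConfig>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ResolvedSource<'a> {
    Pull(&'a str),
    Build(&'a BuildConfig),
}

impl ContainerConfig {
    /// Exactly one of `image` or `build` must be set.
    pub fn resolve(&self) -> Result<ResolvedSource<'_>, String> {
        match (self.image.as_deref(), self.build.as_ref()) {
            (Some(image), None) => Ok(ResolvedSource::Pull(image)),
            (None, Some(build)) => Ok(ResolvedSource::Build(build)),
            (Some(_), Some(_)) => Err("image and build are mutually exclusive".to_string()),
            (None, None) => Err("one of image or build is required".to_string()),
        }
    }
}
```

Matching on the pair makes all four combinations explicit, so a manifest that sets both or neither is rejected at validation time rather than at install time.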
177 changed files with 13386 additions and 3157 deletions

4
.gitignore vendored

@@ -73,3 +73,7 @@ loop/loop.log.bak
# Separate repos nested in tree
web/
._*
# Resilience harness reports (generated, contains session cookies)
scripts/resilience/reports/


@@ -1,5 +1,37 @@
# Changelog
## v1.7.47-alpha (2026-04-29)
- Bitcoin Knots/Core sync is now significantly faster. The container now uses every available core for script verification (was capped at 2) and has 8GB of memory instead of 4GB so its 4GB UTXO cache has headroom for the mempool and peer connections. Existing nodes pick up the new limits on next install/update; freshly-installed nodes start at full speed.
- ElectrumX initial indexing is faster too. Its container memory bumped from 1GB to 2GB and its internal cache is now 2GB (default was 1.2GB).
## v1.7.46-alpha (2026-04-29)
- Health monitor no longer pages "Auto-restart failed" for orphaned containers. After a variant switch (bitcoin-core ↔ bitcoin-knots) the previous variant's container could survive uninstall and the health monitor would try restarting it forever. Now skipped silently with a debug log.
- Apps no longer disappear from My Apps when an install fails. The card stays visible with state=Stopped so the user can retry or uninstall, with the failure reason surfaced via the new install_progress.message field.
- "Downloading…" progress now actually advances during multi-image stack pulls. Was sticking at 20% until all pulls finished; now interpolates 20%→70% based on which image of N has landed.
- Pulled four docker.io images (bitcoin, gitea, nextcloud, valkey) into the lfg2025 registries on OVH and tx1138. Removes a docker.io dependency from first-boot installs.
- Resilience harness improvements: install-fail entries no longer vanish, install/uninstall/probe cells are timing-tolerant (60s retry on ui_probe and auth_probe), dep snapshots no longer leak companion containers into the dependent app's "new containers" set.
## v1.7.45-alpha (2026-04-29)
- Bitcoin RPC auth is durable. The dashboard reliably connects across container restart, image update, and reboot. Was failing on registry-pulled images that shipped a stale baked-in password.
- Multi-container apps show real install progress. IndeedHub (7), BTCPay (4), Mempool (3), Immich (3) — bar advances through Preparing → Pulling → Creating → Done instead of sitting at 0% until the very end.
- Apps no longer disappear from the dashboard mid-install. The container scanner now respects in-flight installs and updates instead of evicting an entry while its containers are still being created.
- IndeedHub installs cleanly on a fresh node. Five missing environment variables fixed; Nostr sign-in works on first install.
- Tailscale install no longer fails with "executable not found". Container command was a malformed shell string; now a proper command array.
- Removed three catalog entries that hung installs for ten minutes (dwn, endurain, ollama — no source images in our registries). Restored Nextcloud, sourced from docker.io.
- Bitcoin Core update path uses the correct image name (was pulling from a non-existent path).
- New ISO installs now allocate swap (sized to RAM, capped at 8GB, on the encrypted data partition). Without swap, container image builds and memory spikes were hitting OOM under load.
## v1.7.44-alpha (2026-04-28)
43de3b73 feat(orchestrator): complete container migration and release hardening
ce39430b feat(self-update): sync and rebuild UI containers on OTA
72dec5aa fix(lnd-ui): align container port across all specs
83aacdf2 chore(release): archive ISO build recipes, tarball-only releases
All notable changes to Archipelago will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
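
The v1.7.46-alpha entry in this changelog hunk says the download bar now interpolates from 20% to 70% according to which image of N has landed. A minimal Python sketch of that arithmetic follows, assuming plain linear interpolation. The function name `pull_progress` and the clamping behaviour are illustrative assumptions, not taken from the Archipelago source.

```python
def pull_progress(images_done: int, images_total: int,
                  start: float = 20.0, end: float = 70.0) -> float:
    """Map 'k of N images pulled' onto the start..end percent band."""
    if images_total <= 0:
        return start  # nothing to pull: stay at the start of the band
    done = max(0, min(images_done, images_total))  # clamp out-of-range counts
    return start + (end - start) * done / images_total


if __name__ == "__main__":
    # A 4-image stack advances in 12.5-point steps between 20 and 70.
    for k in range(5):
        print(k, pull_progress(k, 4))
```

With this shape a single-image app jumps straight from 20 to 70 when its pull finishes, which would match the old behaviour for that case; only multi-image stacks gain intermediate steps.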

@@ -101,14 +101,20 @@ npm run build # Production build → web/dist/neode-ui/
./scripts/deploy-to-target.sh --both # Deploy to both LAN servers
```
### Build ISO
### Release (tarball-only)
Releases ship as a backend binary and a frontend tarball referenced by
`releases/manifest.json`. Nodes OTA-update via `scripts/self-update.sh`.
```bash
ssh archipelago@<server>
cd ~/archy/image-recipe
sudo ./build-auto-installer-iso.sh
./scripts/create-release.sh 1.2.3
git push gitea-local main --tags
git push gitea-vps2 main --tags
```
ISO builds are archived under `image-recipe/_archived/` and not part of the
release deliverable.
## Architecture
```

@@ -1,7 +1,7 @@
{
"version": 2,
"updated": "2026-04-22T00:00:00Z",
"registry": "git.tx1138.com/lfg2025",
"registry": "146.59.87.168:3000/lfg2025",
"featured": {
"id": "indeedhub",
"banner": "/assets/img/featured/indeedhub-banner.jpg",
@@ -11,200 +11,260 @@
},
"apps": [
{
"id": "bitcoin-knots", "title": "Bitcoin Knots", "version": "28.1.0",
"id": "bitcoin-knots",
"title": "Bitcoin Knots",
"version": "28.1.0",
"description": "Run a full Bitcoin node. Validate and relay blocks and transactions.",
"icon": "/assets/img/app-icons/bitcoin-knots.webp",
"author": "Bitcoin Knots", "category": "money", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/bitcoin-knots:latest",
"author": "Bitcoin Knots",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/bitcoin-knots:latest",
"repoUrl": "https://github.com/bitcoinknots/bitcoin"
},
{
"id": "bitcoin-core", "title": "Bitcoin Core", "version": "28.4",
"id": "bitcoin-core",
"title": "Bitcoin Core",
"version": "28.4",
"description": "Reference implementation of the Bitcoin protocol. Run a full node validating and relaying blocks.",
"icon": "/assets/img/app-icons/bitcoin-core.svg",
"author": "Bitcoin Core contributors", "category": "money", "tier": "optional",
"dockerImage": "docker.io/bitcoin/bitcoin:28.4",
"author": "Bitcoin Core contributors",
"category": "money",
"tier": "optional",
"dockerImage": "146.59.87.168:3000/lfg2025/bitcoin:28.4",
"repoUrl": "https://github.com/bitcoin/bitcoin"
},
{
"id": "lnd", "title": "LND", "version": "0.18.4",
"id": "lnd",
"title": "LND",
"version": "0.18.4",
"description": "Lightning Network Daemon. Fast Bitcoin payments through Lightning.",
"icon": "/assets/img/app-icons/lnd.svg",
"author": "Lightning Labs", "category": "money", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/lnd:v0.18.4-beta",
"author": "Lightning Labs",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/lnd:v0.18.4-beta",
"repoUrl": "https://github.com/lightningnetwork/lnd",
"requires": ["bitcoin-knots"]
"requires": [
"bitcoin-knots"
]
},
{
"id": "btcpay-server", "title": "BTCPay Server", "version": "1.13.7",
"id": "btcpay-server",
"title": "BTCPay Server",
"version": "1.13.7",
"description": "Self-hosted Bitcoin payment processor.",
"icon": "/assets/img/app-icons/btcpay-server.png",
"author": "BTCPay Server Foundation", "category": "commerce", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/btcpayserver:1.13.7",
"author": "BTCPay Server Foundation",
"category": "commerce",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/btcpayserver:1.13.7",
"repoUrl": "https://github.com/btcpayserver/btcpayserver",
"requires": ["bitcoin-knots"]
"requires": [
"bitcoin-knots"
]
},
{
"id": "mempool", "title": "Mempool Explorer", "version": "3.0.0",
"id": "mempool",
"title": "Mempool Explorer",
"version": "3.0.0",
"description": "Self-hosted Bitcoin blockchain and mempool visualizer.",
"icon": "/assets/img/app-icons/mempool.webp",
"author": "Mempool", "category": "money", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/mempool-frontend:v3.0.0",
"author": "Mempool",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0",
"repoUrl": "https://github.com/mempool/mempool",
"requires": ["bitcoin-knots", "electrumx"]
"requires": [
"bitcoin-knots",
"electrumx"
]
},
{
"id": "electrumx", "title": "ElectrumX", "version": "1.18.0",
"id": "electrumx",
"title": "ElectrumX",
"version": "1.18.0",
"description": "Electrum protocol server. Index the blockchain for fast wallet lookups.",
"icon": "/assets/img/app-icons/electrumx.webp",
"author": "Luke Childs", "category": "money", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/electrumx:v1.18.0",
"author": "Luke Childs",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/electrumx:v1.18.0",
"repoUrl": "https://github.com/spesmilo/electrumx",
"requires": ["bitcoin-knots"]
"requires": [
"bitcoin-knots"
]
},
{
"id": "indeedhub", "title": "IndeeHub", "version": "1.0.0",
"id": "indeedhub",
"title": "IndeeHub",
"version": "1.0.0",
"description": "Bitcoin documentary streaming with Nostr identity.",
"icon": "/assets/img/app-icons/indeedhub.png",
"author": "IndeeHub", "category": "community",
"dockerImage": "git.tx1138.com/lfg2025/indeedhub:1.0.0",
"author": "IndeeHub",
"category": "community",
"dockerImage": "146.59.87.168:3000/lfg2025/indeedhub:1.0.0",
"repoUrl": "https://github.com/indeedhub/indeedhub"
},
{
"id": "botfights", "title": "BotFights", "version": "1.1.0",
"id": "botfights",
"title": "BotFights",
"version": "1.1.0",
"description": "Bot arena + 2-player arcade fighter with controller support and Adventure Mode.",
"icon": "/assets/img/app-icons/botfights.svg",
"author": "BotFights", "category": "community",
"dockerImage": "git.tx1138.com/lfg2025/botfights:1.1.0",
"author": "BotFights",
"category": "community",
"dockerImage": "146.59.87.168:3000/lfg2025/botfights:1.1.0",
"repoUrl": "https://botfights.net"
},
{
"id": "gitea", "title": "Gitea", "version": "1.23",
"id": "gitea",
"title": "Gitea",
"version": "1.23",
"description": "Self-hosted Git service with container registry, CI/CD, issue tracking.",
"icon": "/assets/img/app-icons/gitea.svg",
"author": "Gitea", "category": "development",
"dockerImage": "docker.io/gitea/gitea:1.23",
"author": "Gitea",
"category": "development",
"dockerImage": "146.59.87.168:3000/lfg2025/gitea:1.23",
"repoUrl": "https://gitea.com"
},
{
"id": "filebrowser", "title": "File Browser", "version": "2.27.0",
"id": "filebrowser",
"title": "File Browser",
"version": "2.27.0",
"description": "Web-based file manager.",
"icon": "/assets/img/app-icons/file-browser.webp",
"author": "File Browser", "category": "data", "tier": "core",
"dockerImage": "git.tx1138.com/lfg2025/filebrowser:v2.27.0",
"author": "File Browser",
"category": "data",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/filebrowser:v2.27.0",
"repoUrl": "https://github.com/filebrowser/filebrowser"
},
{
"id": "vaultwarden", "title": "Vaultwarden", "version": "1.30.0",
"id": "vaultwarden",
"title": "Vaultwarden",
"version": "1.30.0",
"description": "Self-hosted password vault with zero-knowledge encryption.",
"icon": "/assets/img/app-icons/vaultwarden.webp",
"author": "Vaultwarden", "category": "data", "tier": "recommended",
"dockerImage": "git.tx1138.com/lfg2025/vaultwarden:1.30.0-alpine",
"author": "Vaultwarden",
"category": "data",
"tier": "recommended",
"dockerImage": "146.59.87.168:3000/lfg2025/vaultwarden:1.30.0-alpine",
"repoUrl": "https://github.com/dani-garcia/vaultwarden"
},
{
"id": "searxng", "title": "SearXNG", "version": "2024.1.0",
"id": "searxng",
"title": "SearXNG",
"version": "2024.1.0",
"description": "Privacy-respecting metasearch engine.",
"icon": "/assets/img/app-icons/searxng.png",
"author": "SearXNG", "category": "data", "tier": "recommended",
"dockerImage": "git.tx1138.com/lfg2025/searxng:latest",
"author": "SearXNG",
"category": "data",
"tier": "recommended",
"dockerImage": "146.59.87.168:3000/lfg2025/searxng:latest",
"repoUrl": "https://github.com/searxng/searxng"
},
{
"id": "fedimint", "title": "Fedimint", "version": "0.10.0",
"id": "fedimint",
"title": "Fedimint",
"version": "0.10.0",
"description": "Federated Bitcoin mint with privacy through federated guardians.",
"icon": "/assets/img/app-icons/fedimint.png",
"author": "Fedimint", "category": "money",
"dockerImage": "git.tx1138.com/lfg2025/fedimintd:v0.10.0",
"author": "Fedimint",
"category": "money",
"dockerImage": "146.59.87.168:3000/lfg2025/fedimintd:v0.10.0",
"repoUrl": "https://github.com/fedimint/fedimint"
},
{
"id": "ollama", "title": "Ollama", "version": "0.5.4",
"description": "Run AI models locally. Private and on your hardware.",
"icon": "/assets/img/app-icons/ollama.png",
"author": "Ollama", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/ollama:latest",
"repoUrl": "https://github.com/ollama/ollama"
},
{
"id": "nextcloud", "title": "Nextcloud", "version": "28",
"description": "Your own private cloud. File sync, calendars, contacts.",
"icon": "/assets/img/app-icons/nextcloud.webp",
"author": "Nextcloud", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/nextcloud:28",
"repoUrl": "https://github.com/nextcloud/server"
},
{
"id": "jellyfin", "title": "Jellyfin", "version": "10.8.13",
"id": "jellyfin",
"title": "Jellyfin",
"version": "10.8.13",
"description": "Free media server. Stream movies, music, and photos.",
"icon": "/assets/img/app-icons/jellyfin.webp",
"author": "Jellyfin", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/jellyfin:10.8.13",
"author": "Jellyfin",
"category": "data",
"dockerImage": "146.59.87.168:3000/lfg2025/jellyfin:10.8.13",
"repoUrl": "https://github.com/jellyfin/jellyfin"
},
{
"id": "immich", "title": "Immich", "version": "1.90.0",
"id": "immich",
"title": "Immich",
"version": "1.90.0",
"description": "High-performance photo and video backup with ML.",
"icon": "/assets/img/app-icons/immich.png",
"author": "Immich", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/immich-server:release",
"author": "Immich",
"category": "data",
"dockerImage": "146.59.87.168:3000/lfg2025/immich-server:release",
"repoUrl": "https://github.com/immich-app/immich"
},
{
"id": "homeassistant", "title": "Home Assistant", "version": "2024.1",
"id": "homeassistant",
"title": "Home Assistant",
"version": "2024.1",
"description": "Open-source home automation.",
"icon": "/assets/img/app-icons/homeassistant.png",
"author": "Home Assistant", "category": "home",
"dockerImage": "git.tx1138.com/lfg2025/home-assistant:2024.1",
"author": "Home Assistant",
"category": "home",
"dockerImage": "146.59.87.168:3000/lfg2025/home-assistant:2024.1",
"repoUrl": "https://github.com/home-assistant/core"
},
{
"id": "grafana", "title": "Grafana", "version": "10.2.0",
"id": "grafana",
"title": "Grafana",
"version": "10.2.0",
"description": "Analytics and monitoring dashboards.",
"icon": "/assets/img/app-icons/grafana.png",
"author": "Grafana Labs", "category": "data", "tier": "recommended",
"dockerImage": "git.tx1138.com/lfg2025/grafana:10.2.0",
"author": "Grafana Labs",
"category": "data",
"tier": "recommended",
"dockerImage": "146.59.87.168:3000/lfg2025/grafana:10.2.0",
"repoUrl": "https://github.com/grafana/grafana"
},
{
"id": "tailscale", "title": "Tailscale", "version": "1.78.0",
"id": "tailscale",
"title": "Tailscale",
"version": "1.78.0",
"description": "Zero-config VPN with WireGuard mesh networking.",
"icon": "/assets/img/app-icons/tailscale.webp",
"author": "Tailscale", "category": "networking", "tier": "recommended",
"dockerImage": "git.tx1138.com/lfg2025/tailscale:stable",
"author": "Tailscale",
"category": "networking",
"tier": "recommended",
"dockerImage": "146.59.87.168:3000/lfg2025/tailscale:stable",
"repoUrl": "https://github.com/tailscale/tailscale"
},
{
"id": "uptime-kuma", "title": "Uptime Kuma", "version": "1.23.0",
"id": "uptime-kuma",
"title": "Uptime Kuma",
"version": "1.23.0",
"description": "Self-hosted uptime monitoring.",
"icon": "/assets/img/app-icons/uptime-kuma.webp",
"author": "Uptime Kuma", "category": "data", "tier": "recommended",
"dockerImage": "git.tx1138.com/lfg2025/uptime-kuma:1",
"author": "Uptime Kuma",
"category": "data",
"tier": "recommended",
"dockerImage": "146.59.87.168:3000/lfg2025/uptime-kuma:1",
"repoUrl": "https://github.com/louislam/uptime-kuma"
},
{
"id": "dwn", "title": "Decentralized Web Node", "version": "0.4.0",
"description": "Own your data with DID-based access control.",
"icon": "/assets/img/app-icons/dwn.svg",
"author": "TBD", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/dwn-server:main",
"repoUrl": "https://github.com/TBD54566975/dwn-server"
},
{
"id": "endurain", "title": "Endurain", "version": "0.8.0",
"description": "Self-hosted fitness tracking. Strava alternative.",
"icon": "/assets/img/app-icons/endurain.png",
"author": "Endurain", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/endurain:0.8.0",
"repoUrl": "https://github.com/joaovitoriasilva/endurain"
},
{
"id": "photoprism", "title": "PhotoPrism", "version": "240915",
"id": "photoprism",
"title": "PhotoPrism",
"version": "240915",
"description": "AI-powered photo management with facial recognition.",
"icon": "/assets/img/app-icons/photoprism.svg",
"author": "PhotoPrism", "category": "data",
"dockerImage": "git.tx1138.com/lfg2025/photoprism:240915",
"author": "PhotoPrism",
"category": "data",
"dockerImage": "146.59.87.168:3000/lfg2025/photoprism:240915",
"repoUrl": "https://github.com/photoprism/photoprism"
},
{
"id": "nextcloud",
"title": "Nextcloud",
"version": "28",
"description": "Your own private cloud. File sync, calendars, contacts.",
"icon": "/assets/img/app-icons/nextcloud.webp",
"author": "Nextcloud",
"category": "data",
"dockerImage": "146.59.87.168:3000/lfg2025/nextcloud:28",
"repoUrl": "https://github.com/nextcloud/server"
}
]
}
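
Catalog entries above declare dependencies through `requires` (for example, `mempool` requires `bitcoin-knots` and `electrumx`). One way a client could derive an install order from that field is a topological sort. The sketch below uses Python's standard `graphlib` on a trimmed copy of the catalog. The helper name `install_order` is hypothetical, and whether Archipelago resolves dependencies this way is an assumption.

```python
from graphlib import TopologicalSorter

# Trimmed copy of the "apps" array above: only the fields used here.
CATALOG = [
    {"id": "bitcoin-knots"},
    {"id": "lnd", "requires": ["bitcoin-knots"]},
    {"id": "electrumx", "requires": ["bitcoin-knots"]},
    {"id": "mempool", "requires": ["bitcoin-knots", "electrumx"]},
]


def install_order(apps: list[dict], target: str) -> list[str]:
    """Return target and its transitive requirements, dependencies first."""
    by_id = {app["id"]: app for app in apps}
    graph: dict[str, list[str]] = {}
    pending = [target]
    while pending:
        app_id = pending.pop()
        if app_id in graph:
            continue
        deps = by_id[app_id].get("requires", [])  # KeyError if not in catalog
        graph[app_id] = deps
        pending.extend(deps)
    # The graph maps each node to its predecessors, so static_order()
    # yields every dependency before the app that needs it.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(install_order(CATALOG, "mempool"))
```

A cycle in `requires` would make `static_order()` raise `graphlib.CycleError`, which is a convenient place to reject a malformed catalog.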

@@ -0,0 +1,49 @@
app:
id: archy-btcpay-db
name: BTCPay Postgres
version: 15.17
description: Postgres backend for BTCPay and NBXplorer.
container:
image: git.tx1138.com/lfg2025/postgres:15.17
pull_policy: if-not-present
network: archy-net
data_uid: "100998:100998"
secret_env:
- key: POSTGRES_PASSWORD
secret_file: btcpay-db-password
dependencies:
- storage: 20Gi
resources:
memory_limit: 1Gi
disk_limit: 20Gi
security:
capabilities: [CHOWN, FOWNER, SETUID, SETGID, DAC_OVERRIDE]
readonly_root: false
network_policy: isolated
ports: []
volumes:
- type: bind
source: /var/lib/archipelago/postgres-btcpay
target: /var/lib/postgresql/data
options: [rw]
environment:
- POSTGRES_DB=btcpay
- POSTGRES_USER=btcpay
health_check:
type: tcp
endpoint: localhost:5432
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: none
sync_required: false
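
The `health_check` block in this manifest (`type: tcp`, `interval: 30s`, `timeout: 5s`, `retries: 3`) can be read as "try to open a TCP connection, and give up after three failed attempts". A minimal Python sketch of such a probe is below. The function name `tcp_healthy` and the exact retry semantics are assumptions; the diff does not show how the orchestrator interprets these fields.

```python
import socket
import time


def tcp_healthy(host: str, port: int, timeout: float = 5.0,
                retries: int = 3, interval: float = 30.0) -> bool:
    """Return True once a TCP connect succeeds, False after `retries` failures."""
    for attempt in range(retries):
        try:
            # create_connection performs the full TCP handshake or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < retries - 1:
                time.sleep(interval)  # wait before the next probe
    return False


if __name__ == "__main__":
    # Probe the Postgres port named in the manifest, without waiting between tries.
    print(tcp_healthy("localhost", 5432, retries=1, interval=0.0))
```

A TCP probe only proves the port accepts connections; it does not confirm that the database is ready to serve queries.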

@@ -0,0 +1,51 @@
app:
id: archy-mempool-db
name: Mempool MariaDB
version: 11.4.10
description: MariaDB backend for the mempool explorer stack.
container:
image: git.tx1138.com/lfg2025/mariadb:11.4.10
pull_policy: if-not-present
network: archy-net
data_uid: "100998:100998"
secret_env:
- key: MYSQL_PASSWORD
secret_file: mempool-db-password
- key: MYSQL_ROOT_PASSWORD
secret_file: mysql-root-db-password
dependencies:
- storage: 20Gi
resources:
memory_limit: 512Mi
disk_limit: 20Gi
security:
capabilities: [CHOWN, FOWNER, SETUID, SETGID, DAC_OVERRIDE]
readonly_root: false
network_policy: isolated
ports: []
volumes:
- type: bind
source: /var/lib/archipelago/mysql-mempool
target: /var/lib/mysql
options: [rw]
environment:
- MYSQL_DATABASE=mempool
- MYSQL_USER=mempool
health_check:
type: tcp
endpoint: localhost:3306
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: none
sync_required: false

@@ -0,0 +1,50 @@
app:
id: archy-mempool-web
name: Mempool Web
version: 3.0.0
description: Frontend web UI for mempool explorer.
container_name: mempool
container:
image: git.tx1138.com/lfg2025/mempool-frontend:v3.0.0
pull_policy: if-not-present
network: archy-net
dependencies:
- app_id: mempool-api
version: ">=3.0.0"
resources:
memory_limit: 512Mi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports:
- host: 4080
container: 8080
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/mempool/nginx.conf
target: /etc/nginx/conf.d/default.conf
options: [ro]
environment:
- FRONTEND_HTTP_PORT=8080
- BACKEND_MAINNET_HTTP_HOST=mempool-api
health_check:
type: http
endpoint: http://localhost:8080
path: /
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: none
sync_required: false
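
Dependencies in these manifests carry constraints such as `version: ">=3.0.0"`. The Python sketch below shows one way to evaluate a `>=` constraint against dotted numeric versions. Both helper names are hypothetical, only the `>=` operator seen in this diff is handled, and the orchestrator's real comparison rules are not shown here.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """'v3.0.0' or '15.17' -> (3, 0, 0) or (15, 17); numeric parts only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def satisfies(installed: str, constraint: str) -> bool:
    """Check a '>=X.Y.Z' constraint; other operators are out of scope here."""
    if not constraint.startswith(">="):
        raise ValueError(f"unsupported constraint: {constraint!r}")
    wanted = parse_version(constraint[2:].strip())
    have = parse_version(installed)
    # Pad the shorter tuple with zeros so 28.4 compares equal to 28.4.0.
    width = max(len(have), len(wanted))
    have += (0,) * (width - len(have))
    wanted += (0,) * (width - len(wanted))
    return have >= wanted


if __name__ == "__main__":
    print(satisfies("3.0.0", ">=3.0.0"))
    print(satisfies("0.17.9", ">=0.18.0"))
```

Tags with non-numeric suffixes (for example `v0.18.4-beta` in the catalog) would need extra parsing and are deliberately not covered by this sketch.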

@@ -0,0 +1,62 @@
app:
id: archy-nbxplorer
name: NBXplorer
version: 2.6.0
description: BTCPay blockchain indexer service.
container:
image: git.tx1138.com/lfg2025/nbxplorer:2.6.0
pull_policy: if-not-present
network: archy-net
secret_env:
- key: NBXPLORER_BTCRPCPASSWORD
secret_file: bitcoin-rpc-password
- key: BTCPAY_DB_PASS
secret_file: btcpay-db-password
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
- app_id: archy-btcpay-db
version: ">=15.17"
resources:
memory_limit: 2Gi
disk_limit: 20Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports:
- host: 32838
container: 32838
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/nbxplorer
target: /data
options: [rw]
environment:
- NBXPLORER_DATADIR=/data
- NBXPLORER_NETWORK=mainnet
- NBXPLORER_CHAINS=btc
- NBXPLORER_BIND=0.0.0.0:32838
- NBXPLORER_BTCRPCURL=http://bitcoin-knots:8332
- NBXPLORER_BTCRPCUSER=archipelago
- NBXPLORER_POSTGRES=User ID=btcpay;Password=${BTCPAY_DB_PASS};Host=archy-btcpay-db;Port=5432;Database=nbxplorer;Include Error Detail=true
health_check:
type: http
endpoint: http://localhost:32838
path: /
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: read-only
sync_required: true
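
This manifest pairs `secret_env` entries (a key plus a `secret_file`) with an `environment` entry that references `${BTCPAY_DB_PASS}`. A small Python sketch of how such a reference could be resolved is below. It assumes secrets are plain text files in one directory (the bitcoin-ui manifest in this diff mentions `/var/lib/archipelago/secrets`) and that a trailing newline should be trimmed. The helper names are hypothetical, and the real expansion happens in the orchestrator, which this diff does not show.

```python
import tempfile
from pathlib import Path
from string import Template


def load_secret_env(secret_env: list[dict], secrets_dir: Path) -> dict[str, str]:
    """Read each secret_file and map it to its env key, trimming a trailing newline."""
    return {
        item["key"]: (secrets_dir / item["secret_file"]).read_text().rstrip("\n")
        for item in secret_env
    }


def expand(entry: str, env: dict[str, str]) -> str:
    """Expand ${NAME} references; an unknown name raises KeyError."""
    return Template(entry).substitute(env)


if __name__ == "__main__":
    # Demo with a throwaway directory standing in for the secrets directory.
    secrets_dir = Path(tempfile.mkdtemp())
    (secrets_dir / "btcpay-db-password").write_text("example-only\n")
    env = load_secret_env(
        [{"key": "BTCPAY_DB_PASS", "secret_file": "btcpay-db-password"}],
        secrets_dir,
    )
    print(expand("Password=${BTCPAY_DB_PASS};Host=archy-btcpay-db;Port=5432", env))
```

Failing loudly on an unknown name is a deliberate choice in the sketch: a connection string with a literal `${...}` left in it tends to surface much later as an authentication error.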

@@ -1,61 +1,75 @@
app:
id: bitcoin-core
name: Bitcoin Core
name: Bitcoin Knots
version: 28.4.0
description: Full Bitcoin node implementation. The reference implementation of the Bitcoin protocol.
description: Full Bitcoin Knots node with dynamic prune/full-mode startup based on host disk.
container_name: bitcoin-knots
container:
image: bitcoin/bitcoin:28.4
image_signature: cosign://...
pull_policy: verify-signature
image: 146.59.87.168:3000/lfg2025/bitcoin-knots:latest
pull_policy: if-not-present
network: archy-net
entrypoint: ["sh", "-lc"]
custom_args:
# Sync-speed flags: -par=0 uses every core (was capped at 2 by
# --cpus=2, now removed for bitcoin/electrumx). -dbcache sized to
# the IBD sweet spot — 4GB on full nodes, 1GB on pruned. Container
# --memory=8g (config.rs::get_memory_limit) leaves headroom for
# mempool + connections.
- >-
if [ "${DISK_GB:-0}" -lt 1000 ]; then
exec bitcoind -server=1 -prune=550 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=1024 -par=0 -maxconnections=125 -rpcuser="${BITCOIN_RPC_USER}" -rpcpassword="${BITCOIN_RPC_PASS}";
else
exec bitcoind -server=1 -txindex=1 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=4096 -par=0 -maxconnections=125 -rpcuser="${BITCOIN_RPC_USER}" -rpcpassword="${BITCOIN_RPC_PASS}";
fi
derived_env:
- key: DISK_GB
template: "{{DISK_GB}}"
secret_env:
- key: BITCOIN_RPC_PASS
secret_file: bitcoin-rpc-password
data_uid: "100101:100101"
dependencies:
- storage: 500Gi # Minimum disk space for mainnet
- storage: 500Gi
resources:
cpu_limit: 0 # 0 = unlimited; bitcoind uses -par=auto across all cores
memory_limit: 4Gi # matches container-specs.sh bitcoin-knots large-disk dbcache=4096
cpu_limit: 0
memory_limit: 4Gi
disk_limit: 500Gi
security:
capabilities: [] # No special capabilities needed
readonly_root: true
no_new_privileges: true
user: 1000
seccomp_profile: default
capabilities: [CHOWN, FOWNER, SETUID, SETGID, DAC_OVERRIDE]
readonly_root: false
network_policy: isolated
apparmor_profile: bitcoin-core
ports:
- host: 8332
container: 8332
protocol: tcp # RPC
protocol: tcp
- host: 8333
container: 8333
protocol: tcp # P2P
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/bitcoin
target: /home/bitcoin/.bitcoin
options: [rw]
environment:
- NETWORK=mainnet
- RPC_USER=${BITCOIN_RPC_USER}
- RPC_PASSWORD=${BITCOIN_RPC_PASSWORD}
- PRUNE=0 # Full node (set to 550 for pruned)
- BITCOIN_RPC_USER=archipelago
health_check:
type: http
endpoint: http://localhost:8332
path: /
type: tcp
endpoint: localhost:8332
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: admin
sync_required: true
testnet_support: true
testnet_support: false
pruning_support: true
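
The `custom_args` shell snippet in this manifest chooses between a pruned and a full node from `DISK_GB`. The same decision is restated below in Python so the two branches are easier to compare. The function name `bitcoind_args` is hypothetical and the flag order differs from the shell version; the flags themselves are copied from the manifest.

```python
def bitcoind_args(disk_gb: int, rpc_user: str, rpc_pass: str) -> list[str]:
    """Mirror the shell branch: prune below 1000 GB, full node with txindex otherwise."""
    common = [
        "-server=1", "-rpcallowip=0.0.0.0/0", "-rpcbind=0.0.0.0:8332",
        "-listen=1", "-bind=0.0.0.0:8333", "-par=0", "-maxconnections=125",
        f"-rpcuser={rpc_user}", f"-rpcpassword={rpc_pass}",
    ]
    if disk_gb < 1000:
        # Small disk: keep a pruned chain and a 1 GB UTXO cache.
        mode = ["-prune=550", "-dbcache=1024"]
    else:
        # Large disk: keep the full chain with txindex and a 4 GB UTXO cache.
        mode = ["-txindex=1", "-dbcache=4096"]
    return ["bitcoind", *mode, *common]


if __name__ == "__main__":
    print(" ".join(bitcoind_args(500, "archipelago", "example-only")))
```

In the shell version `${DISK_GB:-0}` defaults to 0, so a node where the value is missing takes the pruned branch; passing `disk_gb=0` here reproduces that.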

@@ -0,0 +1,56 @@
app:
id: bitcoin-ui
name: Bitcoin UI
version: 1.0.0
description: |
Archipelago-native HTTP proxy + static site for interacting with the
Bitcoin Core / Bitcoin Knots JSON-RPC. Runs nginx inside a container
and reverse-proxies /bitcoin-rpc/ to 127.0.0.1:8332 on the host. The
upstream Authorization header is substituted from
/var/lib/archipelago/secrets/bitcoin-rpc-password by the prod
orchestrator's pre-start hook, rendered into an nginx.conf that is
bind-mounted read-only at container start.
container:
build:
context: /opt/archipelago/docker/bitcoin-ui
dockerfile: Dockerfile
tag: localhost/bitcoin-ui:local
dependencies:
- app_id: bitcoin-core
resources:
memory_limit: 128Mi
security:
readonly_root: false
network_policy: host
# Host networking: nginx listens on 8334 directly on the host IP, and
# proxies to 127.0.0.1:8332 which is where the bitcoin backend binds
# its RPC. `ports:` is intentionally empty because host networking
# bypasses port mapping.
ports: []
volumes:
# Bind-mount the rendered nginx.conf read-only. The prod orchestrator
# renders /var/lib/archipelago/bitcoin-ui/nginx.conf on every install
# and every reconcile pass, substituting the base64 RPC auth from
# the plaintext password secret. If the rendered bytes change (the
# password rotated, or the template was updated by OTA), the
# reconciler restarts this container so nginx re-reads the config.
- type: bind
source: /var/lib/archipelago/bitcoin-ui/nginx.conf
target: /etc/nginx/conf.d/default.conf
options: [ro]
environment: []
health_check:
type: http
endpoint: http://127.0.0.1:8334
path: /
interval: 30s
timeout: 5s
retries: 3
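
The comments in this manifest describe a render step: the RPC password is turned into a base64 `Authorization` value, written into `nginx.conf`, and the container is restarted when the rendered bytes change. A minimal Python sketch of that flow is below. The `{{RPC_AUTH}}` placeholder, the template text and all function names are assumptions made for illustration; the real template lives in the orchestrator and is not part of this diff.

```python
import base64
from typing import Optional


def basic_auth_value(user: str, password: str) -> str:
    """Value for an HTTP Authorization header using the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
    return f"Basic {token}"


def render(template: str, user: str, password: str) -> bytes:
    """Fill the hypothetical {{RPC_AUTH}} placeholder and return the bytes to write."""
    return template.replace("{{RPC_AUTH}}", basic_auth_value(user, password)).encode()


def needs_restart(old: Optional[bytes], new: bytes) -> bool:
    """Restart when there was no previous render or the rendered bytes differ."""
    return old is None or old != new


if __name__ == "__main__":
    template = 'proxy_set_header Authorization "{{RPC_AUTH}}";\n'
    first = render(template, "archipelago", "old-password")
    second = render(template, "archipelago", "new-password")
    print(needs_restart(None, first), needs_restart(first, first), needs_restart(first, second))
```

Comparing rendered bytes rather than the password itself means a template change delivered by OTA triggers the same restart path as a password rotation, which is the behaviour the manifest comment describes.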

@@ -1,66 +1,70 @@
app:
id: btcpay-server
name: BTCPay Server
version: 1.12.0
version: 1.13.7
description: Self-hosted Bitcoin payment processor. Accept Bitcoin payments without intermediaries.
container:
image: btcpayserver/btcpayserver:1.12.0
image_signature: cosign://...
pull_policy: verify-signature
image: git.tx1138.com/lfg2025/btcpayserver:1.13.7
pull_policy: if-not-present
network: archy-net
secret_env:
- key: BTCPAY_BTCRPCPASSWORD
secret_file: bitcoin-rpc-password
- key: BTCPAY_DB_PASS
secret_file: btcpay-db-password
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
- app_id: lnd
version: ">=0.18.0"
- app_id: archy-btcpay-db
version: ">=15.17"
- app_id: archy-nbxplorer
version: ">=2.6.0"
resources:
cpu_limit: 2
memory_limit: 2Gi
disk_limit: 20Gi
security:
capabilities: [NET_BIND_SERVICE]
readonly_root: true
no_new_privileges: true
user: 1000
seccomp_profile: default
capabilities: []
readonly_root: false
network_policy: isolated
apparmor_profile: btcpay
ports:
- host: 80
container: 80
- host: 23000
container: 49392
protocol: tcp
- host: 443
container: 443
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/btcpay
target: /datadir
options: [rw]
environment:
- BTCPAY_NETWORK=mainnet
- BTCPAY_CHAIN=btc
- BTCPAY_BTCEXPLORERURL=http://bitcoin-core:8332
- BTCPAY_LIGHTNING=type=lnd-rest;server=http://lnd:8080;allowinsecure=true
- ASPNETCORE_URLS=http://0.0.0.0:49392
- BTCPAY_PROTOCOL=http
- BTCPAY_HOST=127.0.0.1:23000
- BTCPAY_CHAINS=btc
- BTCPAY_BTCEXPLORERURL=http://archy-nbxplorer:32838
- BTCPAY_BTCRPCURL=http://bitcoin-knots:8332
- BTCPAY_BTCRPCUSER=archipelago
- BTCPAY_POSTGRES=User ID=btcpay;Password=${BTCPAY_DB_PASS};Host=archy-btcpay-db;Port=5432;Database=btcpay;Include Error Detail=true
health_check:
type: http
endpoint: http://localhost
path: /health
endpoint: http://localhost:49392
path: /
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: read-only
sync_required: true
lightning_integration:
payment_processing: true
payment_processing: false
invoice_management: true

@@ -0,0 +1,38 @@
app:
id: electrs-ui
name: Electrs UI
version: 1.0.0
description: |
Archipelago-native HTTP frontend for electrs/electrumx status. Runs
nginx inside a container, serves static assets, and proxies
/electrs-status to the archipelago backend on 127.0.0.1:5678.
container:
build:
context: /opt/archipelago/docker/electrs-ui
dockerfile: Dockerfile
tag: localhost/electrs-ui:local
dependencies: []
resources:
memory_limit: 64Mi
security:
readonly_root: false
network_policy: host
# Host networking: nginx listens on 50002 directly on the host IP.
ports: []
volumes: []
environment: []
health_check:
type: http
endpoint: http://127.0.0.1:50002
path: /
interval: 30s
timeout: 5s
retries: 3

@@ -0,0 +1,60 @@
app:
id: electrumx
name: ElectrumX
version: 1.18.0
description: Electrum server indexing Bitcoin chain data for lightweight wallet queries.
container:
image: git.tx1138.com/lfg2025/electrumx:v1.18.0
pull_policy: if-not-present
network: archy-net
entrypoint: ["sh", "-lc"]
custom_args:
- >-
export DAEMON_URL="http://archipelago:${BITCOIN_RPC_PASS}@bitcoin-knots:8332/";
exec electrumx_server
secret_env:
- key: BITCOIN_RPC_PASS
secret_file: bitcoin-rpc-password
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
- storage: 50Gi
resources:
cpu_limit: 2
memory_limit: 2Gi
disk_limit: 50Gi
security:
capabilities: [DAC_OVERRIDE]
readonly_root: false
network_policy: isolated
ports:
- host: 50001
container: 50001
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/electrumx
target: /data
options: [rw]
environment:
- COIN=Bitcoin
- DB_DIRECTORY=/data
- SERVICES=tcp://:50001,rpc://0.0.0.0:8000
health_check:
type: tcp
endpoint: localhost:50001
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: read-only
sync_required: true
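
The `custom_args` snippet in this manifest splices the raw RPC password into `DAEMON_URL`. If the password ever contained `@`, `/` or `:`, the URL would parse differently from what was intended. The Python sketch below percent-encodes the userinfo part as a defensive variant. The function name `daemon_url` is hypothetical, and two things are not confirmed by this diff: whether the generated secret can contain such characters, and whether ElectrumX decodes percent-escapes in the userinfo.

```python
from urllib.parse import quote


def daemon_url(user: str, password: str, host: str, port: int) -> str:
    """Build http://user:pass@host:port/ with the userinfo percent-encoded."""
    # safe='' makes quote() escape every reserved character, including '/'.
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/"


if __name__ == "__main__":
    print(daemon_url("archipelago", "p@ss/w:rd", "bitcoin-knots", 8332))
```

For passwords made only of letters and digits the output is identical to what the shell snippet produces, so the encoding step changes nothing in that case.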

@@ -1,6 +0,0 @@
node_modules
dist
*.log
.git
.gitignore
README.md

@@ -1,37 +0,0 @@
FROM node:20-alpine AS builder
WORKDIR /app
# Copy package files
COPY package*.json ./
RUN npm ci --only=production
# Copy source code
COPY . .
# Build the application
RUN npm run build
# Production stage
FROM node:20-alpine
WORKDIR /app
# Copy built application
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
# Create non-root user
RUN addgroup -g 1000 appuser && \
adduser -D -u 1000 -G appuser appuser && \
mkdir -p /app/data && \
chown -R appuser:appuser /app
USER appuser
EXPOSE 8080
ENV ENDURAIN_DATA_DIR=/app/data
CMD ["node", "dist/index.js"]

@@ -1,50 +0,0 @@
app:
id: endurain
name: Endurain
version: 1.0.0
description: Endurain application platform. Custom application runtime.
container:
image: archipelago/endurain:1.0.0
image_signature: cosign://...
pull_policy: if-not-present
dependencies:
- storage: 2Gi
resources:
cpu_limit: 2
memory_limit: 1Gi
disk_limit: 2Gi
security:
capabilities: []
readonly_root: true
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: isolated
apparmor_profile: endurain
ports:
- host: 8085
container: 8080
protocol: tcp # Web UI
volumes:
- type: bind
source: /var/lib/archipelago/endurain
target: /app/data
options: [rw]
environment:
- ENDURAIN_ENV=production
- ENDURAIN_DATA_DIR=/app/data
health_check:
type: http
endpoint: http://localhost:8085
path: /health
interval: 30s
timeout: 5s
retries: 3

File diff suppressed because it is too large

@@ -1,20 +0,0 @@
{
"name": "endurain",
"version": "1.0.0",
"description": "Endurain application platform",
"main": "dist/index.js",
"scripts": {
"build": "tsc",
"start": "node dist/index.js",
"dev": "ts-node src/index.ts"
},
"dependencies": {
"express": "^4.18.2"
},
"devDependencies": {
"@types/express": "^4.17.21",
"@types/node": "^20.10.0",
"typescript": "^5.3.3",
"ts-node": "^10.9.2"
}
}

@@ -1,27 +0,0 @@
import express from 'express';
const app = express();
const port = 8080;
// Middleware
app.use(express.json());
// Health check endpoint
app.get('/health', (req, res) => {
res.json({ status: 'ok', service: 'endurain', version: '1.0.0' });
});
// API endpoints
app.get('/api/info', (req, res) => {
res.json({
name: 'Endurain',
version: '1.0.0',
status: 'running'
});
});
// Start server
app.listen(port, '0.0.0.0', () => {
console.log(`Endurain listening on port ${port}`);
console.log(`Data directory: ${process.env.ENDURAIN_DATA_DIR || '/app/data'}`);
});

@@ -1,16 +0,0 @@
{
"compilerOptions": {
"target": "ES2020",
"module": "commonjs",
"lib": ["ES2020"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}

@@ -0,0 +1,73 @@
app:
id: fedimint-gateway
name: Fedimint Gateway
version: 0.10.0
description: Fedimint gateway service with automatic LND-or-LDK backend selection.
container:
image: git.tx1138.com/lfg2025/gatewayd:v0.10.0
pull_policy: if-not-present
network: archy-net
entrypoint: ["sh", "-lc"]
custom_args:
- >-
if [ -f /lnd/tls.cert ] && [ -f /lnd/data/chain/bitcoin/mainnet/admin.macaroon ]; then
exec gatewayd --data-dir /data --listen 0.0.0.0:8176 --bcrypt-password-hash "$FEDI_HASH" --network bitcoin --bitcoind-url http://bitcoin-knots:8332 --bitcoind-username "$FM_BITCOIND_USERNAME" --bitcoind-password "$FM_BITCOIND_PASSWORD" lnd --lnd-rpc-host lnd:10009 --lnd-tls-cert /lnd/tls.cert --lnd-macaroon /lnd/data/chain/bitcoin/mainnet/admin.macaroon;
else
exec gatewayd --data-dir /data --listen 0.0.0.0:8176 --bcrypt-password-hash "$FEDI_HASH" --network bitcoin --bitcoind-url http://bitcoin-knots:8332 --bitcoind-username "$FM_BITCOIND_USERNAME" --bitcoind-password "$FM_BITCOIND_PASSWORD" ldk --ldk-lightning-port 9737 --ldk-alias archipelago-gateway;
fi
secret_env:
- key: FM_BITCOIND_PASSWORD
secret_file: bitcoin-rpc-password
- key: FEDI_HASH
secret_file: fedimint-gateway-hash
data_uid: "100000:100000"
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
- app_id: fedimint
version: ">=0.10.0"
resources:
cpu_limit: 2
memory_limit: 2Gi
disk_limit: 10Gi
security:
capabilities: []
readonly_root: true
network_policy: isolated
ports:
- host: 8176
container: 8176
protocol: tcp
- host: 9737
container: 9737
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/fedimint-gateway
target: /data
options: [rw]
- type: bind
source: /var/lib/archipelago/lnd
target: /lnd
options: [ro]
environment:
- FM_BITCOIND_USERNAME=archipelago
health_check:
type: http
endpoint: http://localhost:8176
path: /
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: admin
sync_required: true
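
Note on the backend selection in `custom_args` above: the shell test picks LND only when both the TLS certificate and the admin macaroon are present under the read-only `/lnd` mount, and falls back to LDK otherwise. A minimal Rust sketch of the same decision, assuming a hypothetical `select_backend` helper and `Backend` enum (neither exists in this diff):

```rust
use std::path::Path;

/// Which Lightning backend the gateway would be started with.
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Lnd,
    Ldk,
}

/// Mirrors the manifest's `[ -f ... ] && [ -f ... ]` test: LND is chosen only
/// when both the TLS certificate and the admin macaroon are regular files
/// under the mounted LND directory; anything else falls back to LDK.
fn select_backend(lnd_root: &Path) -> Backend {
    let tls_cert = lnd_root.join("tls.cert");
    let macaroon = lnd_root.join("data/chain/bitcoin/mainnet/admin.macaroon");
    if tls_cert.is_file() && macaroon.is_file() {
        Backend::Lnd
    } else {
        Backend::Ldk
    }
}
```

In the manifest the root would be `/lnd`. Because the check runs once at container start, an LND installed later would presumably only be picked up after the gateway restarts.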

@@ -3,56 +3,62 @@ app:
name: Fedimint
version: 0.10.0
description: Federated Bitcoin minting service with built-in Guardian UI. Privacy-preserving Bitcoin custody.
container:
image: fedimint/fedimintd:v0.10.0
image_signature: cosign://...
image: git.tx1138.com/lfg2025/fedimintd:v0.10.0
pull_policy: if-not-present
network: archy-net
derived_env:
- key: FM_P2P_URL
template: fedimint://{{HOST_MDNS}}:8173
- key: FM_API_URL
template: ws://{{HOST_MDNS}}:8174
secret_env:
- key: FM_BITCOIND_PASSWORD
secret_file: bitcoin-rpc-password
data_uid: "100000:100000"
dependencies:
- app_id: bitcoin-core
version: ">=24.0"
version: ">=26.0"
- storage: 20Gi
resources:
cpu_limit: 4
memory_limit: 4Gi
disk_limit: 20Gi
security:
capabilities: []
readonly_root: true
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: isolated
apparmor_profile: fedimint
ports:
- host: 8173
container: 8173
protocol: tcp # P2P
protocol: tcp
- host: 8174
container: 8174
protocol: tcp # API
protocol: tcp
- host: 8175
container: 8175
protocol: tcp # Built-in Guardian UI
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/fedimint
target: /fedimint
target: /data
options: [rw]
environment:
- FM_DATA_DIR=/fedimint
- FM_BITCOIND_URL=http://bitcoin-core:8332
- FM_BITCOIND_USERNAME=${BITCOIN_RPC_USER}
- FM_BITCOIND_PASSWORD=${BITCOIN_RPC_PASSWORD}
- FM_DATA_DIR=/data
- FM_BITCOIND_URL=http://bitcoin-knots:8332
- FM_BITCOIND_USERNAME=archipelago
- FM_BITCOIN_NETWORK=bitcoin
- FM_BIND_P2P=0.0.0.0:8173
- FM_BIND_API=0.0.0.0:8174
- FM_BIND_UI=0.0.0.0:8175
health_check:
type: http
endpoint: http://localhost:8175
@@ -60,7 +66,7 @@ app:
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: admin
sync_required: true

@@ -0,0 +1,53 @@
app:
id: filebrowser
name: File Browser
version: 2.27.0
description: Baseline Archipelago file manager service.
container:
image: git.tx1138.com/lfg2025/filebrowser:v2.27.0
pull_policy: if-not-present
network: archy-net
custom_args: ["--config", "/data/.filebrowser.json"]
data_uid: "100000:100000"
dependencies:
- storage: 10Gi
resources:
memory_limit: 256Mi
disk_limit: 10Gi
security:
capabilities: [CHOWN, FOWNER, SETUID, SETGID, DAC_OVERRIDE, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
- host: 8083
container: 80
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/filebrowser
target: /srv
options: [rw]
- type: bind
source: /var/lib/archipelago/filebrowser-data
target: /data
options: [rw]
environment: []
health_check:
type: http
endpoint: http://localhost:80
path: /health
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: none
sync_required: false

@@ -6,7 +6,7 @@ app:
category: media
container:
image: git.tx1138.com/lfg2025/indeedhub:latest
image: 146.59.87.168:3000/lfg2025/indeedhub:latest
pull_policy: always # Pull from registry; falls back to local build
dependencies:

apps/lnd-ui/manifest.yml Normal file
@@ -0,0 +1,44 @@
app:
id: lnd-ui
name: LND UI
version: 1.0.0
description: |
Archipelago-native HTTP frontend for LND. Runs nginx inside a
container and serves static assets. LND connection info is fetched
via an absolute URL that the host nginx routes to the archipelago
backend on 127.0.0.1:5678, so no upstream auth is baked in.
container:
build:
context: /opt/archipelago/docker/lnd-ui
dockerfile: Dockerfile
tag: localhost/lnd-ui:local
dependencies:
- app_id: lnd
resources:
memory_limit: 64Mi
security:
readonly_root: false
network_policy: bridge
# Bridge networking via archy-net. Container nginx listens on 80;
# host nginx proxies /app/lnd/ -> 127.0.0.1:8081 -> container:80.
ports:
- host: 8081
container: 80
protocol: tcp
volumes: []
environment: []
health_check:
type: http
endpoint: http://127.0.0.1:8081
path: /
interval: 30s
timeout: 5s
retries: 3

@@ -1,67 +1,65 @@
app:
id: lnd
name: Lightning Network Daemon
version: 0.18.0
version: 0.18.4
description: Lightning Network implementation by Lightning Labs. Enables instant, low-cost Bitcoin payments.
container:
image: lightninglabs/lnd:v0.18.0
image_signature: cosign://...
pull_policy: verify-signature
image: git.tx1138.com/lfg2025/lnd:v0.18.4-beta
pull_policy: if-not-present
network: archy-net
secret_env:
- key: BITCOIND_RPCPASS
secret_file: bitcoin-rpc-password
data_uid: "100000:100000"
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
resources:
cpu_limit: 2
memory_limit: 1Gi
disk_limit: 10Gi
security:
capabilities: [NET_BIND_SERVICE]
readonly_root: true
no_new_privileges: true
user: 1000
seccomp_profile: default
capabilities: [CHOWN, FOWNER, SETUID, SETGID, DAC_OVERRIDE, NET_RAW]
readonly_root: false
network_policy: isolated
apparmor_profile: lnd
ports:
- host: 9735
container: 9735
protocol: tcp # P2P
protocol: tcp
- host: 10009
container: 10009
protocol: tcp # gRPC
protocol: tcp
- host: 8080
container: 8080
protocol: tcp # REST
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/lnd
target: /root/.lnd
options: [rw]
environment:
- BITCOIND_HOST=bitcoin-core
- BITCOIND_RPCUSER=${BITCOIN_RPC_USER}
- BITCOIND_RPCPASS=${BITCOIN_RPC_PASSWORD}
- BITCOIND_HOST=bitcoin-knots
- BITCOIND_RPCUSER=archipelago
- NETWORK=mainnet
health_check:
type: http
endpoint: http://localhost:8080
path: /v1/getinfo
type: tcp
endpoint: localhost:10009
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: admin
sync_required: true
lightning_integration:
channel_management: true
payment_routing: true
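
Note on the health check change above: the probe moves from an HTTP request against the REST port (`/v1/getinfo` on 8080) to a plain TCP check against the gRPC port (10009). A TCP check of this kind only proves that something accepts connections on the port. A minimal sketch using the standard library, assuming a hypothetical `tcp_health_check` helper (how the Archipelago health monitor actually probes is not shown in this diff):

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true when a TCP connection to `addr` can be established within
/// `timeout`. No bytes are exchanged, so this says nothing about whether the
/// service behind the port is unlocked, synced, or otherwise usable.
fn tcp_health_check(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```

With the manifest's values this would be called with `127.0.0.1:10009` and a 5 second timeout, repeated every 30 seconds.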

@@ -0,0 +1,68 @@
app:
id: mempool-api
name: Mempool API
version: 3.0.0
description: Backend API for mempool explorer.
container:
image: git.tx1138.com/lfg2025/mempool-backend:v3.0.0
pull_policy: if-not-present
network: archy-net
secret_env:
- key: CORE_RPC_PASSWORD
secret_file: bitcoin-rpc-password
- key: DATABASE_PASSWORD
secret_file: mempool-db-password
dependencies:
- app_id: bitcoin-core
version: ">=26.0"
- app_id: electrumx
version: ">=1.18.0"
- app_id: archy-mempool-db
version: ">=11.4.10"
resources:
memory_limit: 2Gi
disk_limit: 20Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports:
- host: 8999
container: 8999
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/mempool
target: /data
options: [rw]
environment:
- MEMPOOL_BACKEND=electrum
- ELECTRUM_HOST=electrumx
- ELECTRUM_PORT=50001
- ELECTRUM_TLS_ENABLED=false
- CORE_RPC_HOST=bitcoin-knots
- CORE_RPC_PORT=8332
- CORE_RPC_USERNAME=archipelago
- DATABASE_ENABLED=true
- DATABASE_HOST=archy-mempool-db
- DATABASE_DATABASE=mempool
- DATABASE_USERNAME=mempool
health_check:
type: http
endpoint: http://localhost:8999
path: /
interval: 30s
timeout: 5s
retries: 3
bitcoin_integration:
rpc_access: read-only
sync_required: true

@@ -1,5 +0,0 @@
# Ollama - uses official image
FROM ollama/ollama:latest
# Default configuration is in the image
# No additional setup needed

@@ -1,50 +0,0 @@
app:
id: ollama
name: Ollama
version: 0.1.0
description: Run large language models locally. Privacy-preserving AI on your node.
container:
image: ollama/ollama:0.6.2
image_signature: cosign://...
pull_policy: if-not-present
dependencies:
- storage: 50Gi # Models can be large
resources:
cpu_limit: 4
memory_limit: 8Gi # LLMs need lots of RAM
disk_limit: 50Gi
security:
capabilities: []
readonly_root: false # Ollama needs write access for models
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: isolated
apparmor_profile: ollama
ports:
- host: 11434
container: 11434
protocol: tcp # API
volumes:
- type: bind
source: /var/lib/archipelago/ollama
target: /root/.ollama
options: [rw]
environment:
- OLLAMA_HOST=0.0.0.0:11434
- OLLAMA_KEEP_ALIVE=24h
health_check:
type: http
endpoint: http://localhost:11434
path: /api/tags
interval: 30s
timeout: 10s
retries: 3

core/Cargo.lock generated
@@ -80,13 +80,14 @@ checksum = "a23eb6b1614318a8071c9b2521f36b424b2c83db5eb3a0fead4a6c0809af6e61"
[[package]]
name = "archipelago"
version = "1.7.42-alpha"
version = "1.7.47-alpha"
dependencies = [
"anyhow",
"archipelago-container",
"archipelago-performance",
"archipelago-security",
"argon2",
"async-trait",
"base64 0.21.7",
"bcrypt",
"bip39",

View File

@@ -1,6 +1,6 @@
[package]
name = "archipelago"
version = "1.7.42-alpha"
version = "1.7.47-alpha"
edition = "2021"
description = "Archipelago Bitcoin Node OS - Native backend"
authors = ["Archipelago Team"]
@@ -103,6 +103,9 @@ mdns-sd = "0.18"
# Systemd watchdog notification
sd-notify = "0.4"
# Trait objects for async methods (container orchestrator trait, Step 4)
async-trait = "0.1"
[dev-dependencies]
tokio-test = "0.4"
tempfile = "3.10"

@@ -10,6 +10,7 @@ mod websocket;
use crate::api::rpc::RpcHandler;
use crate::blobs::BlobStore;
use crate::config::Config;
use crate::container::{ContainerOrchestrator, DevContainerOrchestrator};
use crate::monitoring::MetricsStore;
use crate::session::{self, SessionStore};
use crate::state::StateManager;
@@ -54,6 +55,8 @@ impl ApiHandler {
config: Config,
state_manager: Arc<StateManager>,
metrics_store: Arc<MetricsStore>,
orchestrator: Option<Arc<dyn ContainerOrchestrator>>,
dev_orchestrator: Option<Arc<DevContainerOrchestrator>>,
) -> Result<Self> {
let session_store = SessionStore::new().await;
let rpc_handler = Arc::new(
@@ -62,6 +65,8 @@ impl ApiHandler {
state_manager.clone(),
metrics_store.clone(),
session_store.clone(),
orchestrator,
dev_orchestrator,
)
.await?,
);
@@ -125,8 +130,7 @@ impl ApiHandler {
/// persisted a registry config yet. 15s total timeout.
async fn handle_app_catalog_proxy(&self) -> Result<Response<hyper::Body>> {
let mut upstreams: Vec<String> = Vec::new();
if let Ok(config) =
crate::container::registry::load_registries(&self.config.data_dir).await
if let Ok(config) = crate::container::registry::load_registries(&self.config.data_dir).await
{
for reg in config.active_registries() {
let scheme = if reg.tls_verify { "https" } else { "http" };
@@ -141,7 +145,7 @@ impl ApiHandler {
}
if upstreams.is_empty() {
upstreams.push(
"http://23.182.128.160:3000/lfg2025/app-catalog/raw/branch/main/catalog.json"
"http://146.59.87.168:3000/lfg2025/app-catalog/raw/branch/main/catalog.json"
.to_string(),
);
upstreams.push(
@@ -316,7 +320,7 @@ impl ApiHandler {
match (method, path.as_str()) {
// RPC — auth is handled inside rpc handler per-method
(Method::POST, "/rpc/v1") => self.rpc_handler.handle(req_with_bytes).await,
(Method::POST, "/rpc/v1") => self.rpc_handler.clone().handle(req_with_bytes).await,
// Health — unauthenticated, returns JSON with service status
(Method::GET, "/health") => {

@@ -408,9 +408,8 @@ async fn bitcoin_rpc_post_with_retry_cfg<T: serde::de::DeserializeOwned>(
.ok_or_else(|| anyhow::anyhow!("Bitcoin RPC returned null result"));
}
Err(last_err.unwrap_or_else(|| {
anyhow::anyhow!("Bitcoin RPC exhausted retries with no error captured")
}))
Err(last_err
.unwrap_or_else(|| anyhow::anyhow!("Bitcoin RPC exhausted retries with no error captured")))
}
#[cfg(test)]
@@ -428,7 +427,11 @@ mod tests {
/// oneshot cancel channel).
async fn spawn_mock<F, Fut>(
handler: F,
) -> (String, tokio::task::JoinHandle<()>, tokio::sync::oneshot::Sender<()>)
) -> (
String,
tokio::task::JoinHandle<()>,
tokio::sync::oneshot::Sender<()>,
)
where
F: Fn(Request<Body>) -> Fut + Send + Sync + Clone + 'static,
Fut: std::future::Future<Output = Response<Body>> + Send + 'static,
@@ -447,7 +450,9 @@ mod tests {
let url = format!("http://{}", server.local_addr());
let (tx, rx) = tokio::sync::oneshot::channel::<()>();
let handle = tokio::spawn(async move {
let graceful = server.with_graceful_shutdown(async { let _ = rx.await; });
let graceful = server.with_graceful_shutdown(async {
let _ = rx.await;
});
let _ = graceful.await;
});
(url, handle, tx)
@@ -477,16 +482,10 @@ mod tests {
.await;
let client = reqwest::Client::builder().build().unwrap();
let v: u64 = bitcoin_rpc_post_with_retry(
&client,
&url,
"user",
"pass",
"getblockcount",
&[],
)
.await
.expect("should succeed");
let v: u64 =
bitcoin_rpc_post_with_retry(&client, &url, "user", "pass", "getblockcount", &[])
.await
.expect("should succeed");
assert_eq!(v, 42);
assert_eq!(count.load(Ordering::SeqCst), 1, "should not have retried");
}
@@ -512,15 +511,8 @@ mod tests {
.await;
let client = reqwest::Client::builder().build().unwrap();
let result: Result<u64> = bitcoin_rpc_post_with_retry(
&client,
&url,
"user",
"pass",
"getblockcount",
&[],
)
.await;
let result: Result<u64> =
bitcoin_rpc_post_with_retry(&client, &url, "user", "pass", "getblockcount", &[]).await;
assert!(result.is_err(), "non-JSON response should error out");
assert_eq!(
count.load(Ordering::SeqCst),
@@ -544,15 +536,9 @@ mod tests {
.build()
.unwrap();
let start = std::time::Instant::now();
let result: Result<u64> = bitcoin_rpc_post_with_retry(
&client,
&closed_url,
"user",
"pass",
"getblockcount",
&[],
)
.await;
let result: Result<u64> =
bitcoin_rpc_post_with_retry(&client, &closed_url, "user", "pass", "getblockcount", &[])
.await;
let elapsed = start.elapsed();
assert!(result.is_err(), "connect-refused should exhaust retries");
let min_backoff: std::time::Duration = BITCOIN_RPC_BACKOFFS.iter().sum();
@@ -629,15 +615,8 @@ mod tests {
.await;
let client = reqwest::Client::builder().build().unwrap();
let result: Result<u64> = bitcoin_rpc_post_with_retry(
&client,
&url,
"user",
"pass",
"getblockcount",
&[],
)
.await;
let result: Result<u64> =
bitcoin_rpc_post_with_retry(&client, &url, "user", "pass", "getblockcount", &[]).await;
assert!(result.is_err());
assert_eq!(
count.load(Ordering::SeqCst),
@@ -652,12 +631,14 @@ mod tests {
#[test]
fn retry_budget_invariants() {
assert_eq!(BITCOIN_RPC_MAX_ATTEMPTS, 3);
assert_eq!(BITCOIN_RPC_BACKOFFS.len(), (BITCOIN_RPC_MAX_ATTEMPTS - 1) as usize);
assert_eq!(
BITCOIN_RPC_BACKOFFS.len(),
(BITCOIN_RPC_MAX_ATTEMPTS - 1) as usize
);
// Total wall-time ceiling:
// 3 attempts * 15s + (0.5s + 1.5s) backoff = 47s
let total: std::time::Duration =
BITCOIN_RPC_ATTEMPT_TIMEOUT * BITCOIN_RPC_MAX_ATTEMPTS
+ BITCOIN_RPC_BACKOFFS.iter().sum::<std::time::Duration>();
let total: std::time::Duration = BITCOIN_RPC_ATTEMPT_TIMEOUT * BITCOIN_RPC_MAX_ATTEMPTS
+ BITCOIN_RPC_BACKOFFS.iter().sum::<std::time::Duration>();
assert!(total < std::time::Duration::from_secs(60));
}
}
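
Note on `retry_budget_invariants` above: the 47s ceiling follows from three attempts at 15s each plus the two backoffs between them. The sketch below reproduces that arithmetic and a bare retry loop. The constant values are assumptions inferred from the test comment, not copied from the module. Unlike the real helper, which (judging by the non-JSON test) does not retry every failure, this simplified loop retries on any error. The sleep is injected so the loop can be exercised without waiting.

```rust
use std::time::Duration;

// Assumed values, inferred from "3 attempts * 15s + (0.5s + 1.5s) backoff = 47s".
const MAX_ATTEMPTS: u32 = 3;
const ATTEMPT_TIMEOUT: Duration = Duration::from_secs(15);
const BACKOFFS: [Duration; 2] = [Duration::from_millis(500), Duration::from_millis(1500)];

/// Worst-case wall time: every attempt runs into its timeout and every
/// backoff between attempts is slept in full.
fn worst_case_budget() -> Duration {
    ATTEMPT_TIMEOUT * MAX_ATTEMPTS + BACKOFFS.iter().sum::<Duration>()
}

/// Calls `op` up to MAX_ATTEMPTS times. After failed attempt `i` (0-based),
/// sleeps BACKOFFS[i] before trying again; the last failure is returned as-is.
fn retry_with_backoff<T, E>(
    mut op: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt + 1 >= MAX_ATTEMPTS => return Err(err),
            Err(_) => {
                sleep(BACKOFFS[attempt as usize]);
                attempt += 1;
            }
        }
    }
}
```

Keeping `BACKOFFS.len() == MAX_ATTEMPTS - 1`, as the test asserts, is what makes the index into `BACKOFFS` safe here.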

@@ -1,4 +1,5 @@
use super::package::validate_app_id;
use super::transitional::Op;
use super::RpcHandler;
use anyhow::{Context, Result};
@@ -7,8 +8,13 @@ impl RpcHandler {
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
// The `container-install { manifest_path }` RPC is a dev-mode convenience
// that points at an arbitrary YAML on disk. Production install happens via
// the reconciler (BootReconciler, Step 5) and via the unified
// ContainerOrchestrator::install(app_id) trait call, which can be exposed
// through a separate `container-install-by-id` RPC when needed.
let dev = self.dev_orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("container-install with manifest_path is only available in dev mode")
})?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
@@ -45,7 +51,7 @@ impl RpcHandler {
let manifest: archipelago_container::AppManifest =
serde_yaml::from_str(&manifest_content).context("Failed to parse manifest")?;
let container_name = orchestrator
let container_name = dev
.install_container(&manifest, manifest_path)
.await
.context("Failed to install container")?;
@@ -57,10 +63,6 @@ impl RpcHandler {
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
.get("app_id")
@@ -68,22 +70,24 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Missing app_id"))?;
validate_app_id(app_id)?;
orchestrator
.start_container(app_id)
.await
.context("Failed to start container")?;
// User explicitly started the app — clear the user-stopped marker so
// crash recovery / health monitor won't second-guess it. Must happen
// BEFORE the spawn (see runtime.rs:145-148 for the symmetric stop
// side and the ordering contract crash recovery depends on).
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, app_id).await;
Ok(serde_json::json!({ "status": "started" }))
// spawn_transitional returns as soon as the background task is
// launched (<1s). The UI sees Starting… immediately via WebSocket.
self.spawn_transitional(Op::Start, app_id.to_string())
.await?;
Ok(serde_json::json!({ "status": "starting" }))
}
pub(super) async fn handle_container_stop(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
.get("app_id")
@@ -91,21 +95,51 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Missing app_id"))?;
validate_app_id(app_id)?;
orchestrator
.stop_container(app_id)
.await
.context("Failed to stop container")?;
// Mark as user-stopped BEFORE the spawn — ordering is load-bearing
// (crash recovery / health monitor inspect this flag concurrently
// with the in-flight stop; see runtime.rs:145-148 for the package
// path that also writes this in the same order).
crate::crash_recovery::mark_user_stopped(&self.config.data_dir, app_id).await;
Ok(serde_json::json!({ "status": "stopped" }))
// podman stop -t 600 (bitcoin-core) / -t 330 (lnd) runs in the
// background; the RPC returns now with "stopping".
self.spawn_transitional(Op::Stop, app_id.to_string())
.await?;
Ok(serde_json::json!({ "status": "stopping" }))
}
pub(super) async fn handle_container_restart(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
.get("app_id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing app_id"))?;
validate_app_id(app_id)?;
// Restart does not mark user-stopped (the user wants the app to
// keep running). Clear the marker as a defensive measure in case a
// prior stop left it set and the restart is intended to revive the
// normal running state.
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, app_id).await;
self.spawn_transitional(Op::Restart, app_id.to_string())
.await?;
Ok(serde_json::json!({ "status": "restarting" }))
}
pub(super) async fn handle_container_remove(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
@@ -119,7 +153,7 @@ impl RpcHandler {
.unwrap_or(false);
orchestrator
.remove_container(app_id, preserve_data)
.remove(app_id, preserve_data)
.await
.context("Failed to remove container")?;
@@ -137,12 +171,25 @@ impl RpcHandler {
.package_data
.iter()
.map(|(id, pkg)| {
// Keep this mapping in sync with the UI's
// ContainerStatus.state union in
// neode-ui/src/api/container-client.ts. The UI maps
// transitional variants to single-button labels
// (Stopping… / Starting… / Restarting…).
let state = match &pkg.state {
crate::data_model::PackageState::Running => "running",
crate::data_model::PackageState::Stopped => "stopped",
crate::data_model::PackageState::Exited => "exited",
crate::data_model::PackageState::Starting => "created",
_ => "unknown",
crate::data_model::PackageState::Starting => "starting",
crate::data_model::PackageState::Stopping => "stopping",
crate::data_model::PackageState::Restarting => "restarting",
crate::data_model::PackageState::Installing => "installing",
crate::data_model::PackageState::Installed => "installed",
crate::data_model::PackageState::Updating => "updating",
crate::data_model::PackageState::Removing => "removing",
crate::data_model::PackageState::CreatingBackup => "creating-backup",
crate::data_model::PackageState::RestoringBackup => "restoring-backup",
crate::data_model::PackageState::BackingUp => "backing-up",
};
let lan = pkg
.installed
@@ -163,9 +210,9 @@ impl RpcHandler {
return Ok(serde_json::json!(containers));
}
// Fallback: scanner hasn't run yet, query podman directly
// Fallback: scanner hasn't run yet, query the orchestrator directly.
if let Some(orchestrator) = &self.orchestrator {
if let Ok(containers) = orchestrator.list_containers().await {
if let Ok(containers) = orchestrator.list().await {
if !containers.is_empty() {
return Ok(serde_json::to_value(containers)?);
}
@@ -242,9 +289,10 @@ impl RpcHandler {
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
@@ -253,21 +301,36 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Missing app_id"))?;
validate_app_id(app_id)?;
let status = orchestrator
.get_container_status(app_id)
.await
.context("Failed to get container status")?;
let mut last_err: Option<anyhow::Error> = None;
for candidate in status_app_id_candidates(app_id) {
match orchestrator.status(&candidate).await {
Ok(status) => return Ok(serde_json::to_value(status)?),
Err(e) => last_err = Some(e),
}
}
Ok(serde_json::to_value(status)?)
// Fallback for alias drift: query podman directly by likely container
// names so status checks stay useful during migration.
for name in status_container_name_candidates(app_id) {
if let Some(v) = inspect_container_state_value(&name).await {
return Ok(v);
}
}
if let Some(e) = last_err {
return Err(e.context("Failed to get container status"));
}
Err(anyhow::anyhow!("Failed to get container status"))
}
pub(super) async fn handle_container_logs(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?;
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
@@ -278,7 +341,7 @@ impl RpcHandler {
let lines = params.get("lines").and_then(|v| v.as_u64()).unwrap_or(100) as u32;
let logs = orchestrator
.get_container_logs(app_id, lines)
.logs(app_id, lines)
.await
.context("Failed to get container logs")?;
@@ -291,12 +354,13 @@ impl RpcHandler {
app_id: &str,
lines: u32,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?;
let logs = orchestrator
.get_container_logs(app_id, lines)
.logs(app_id, lines)
.await
.context("Failed to get container logs")?;
@@ -307,43 +371,52 @@ impl RpcHandler {
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let orchestrator = self.orchestrator.as_ref().ok_or_else(|| {
anyhow::anyhow!("Container orchestrator not available (dev mode required)")
})?;
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?;
// If app_id is provided, get health for that app
// If app_id is provided, get health for that app.
if let Some(params) = params {
if let Some(app_id) = params.get("app_id").and_then(|v| v.as_str()) {
let health = orchestrator
.get_health_status(app_id)
.health(app_id)
.await
.context("Failed to get container health")?;
return Ok(serde_json::json!({ app_id: health }));
}
}
// Otherwise, get health for all containers
// Otherwise, get health for all containers.
let containers = orchestrator
.list_containers()
.list()
.await
.context("Failed to list containers")?;
let mut health_map = serde_json::Map::new();
for container in containers {
if let Some(app_id) = container.name.strip_prefix("archipelago-") {
if let Some(app_id) = app_id.strip_suffix("-dev") {
match orchestrator.get_health_status(app_id).await {
Ok(health) => {
health_map
.insert(app_id.to_string(), serde_json::Value::String(health));
}
Err(_) => {
health_map.insert(
app_id.to_string(),
serde_json::Value::String("unknown".to_string()),
);
}
}
// Map the runtime container name back to the app_id the orchestrator
// knows about. Dev orchestrator uses `archipelago-<id>-dev`; Prod
// uses bare `<id>` (or `archy-<id>` for UIs — health() accepts the
// app_id either way since UI_APP_IDS is centralised).
let app_id_candidate = container
.name
.strip_prefix("archipelago-")
.and_then(|s| s.strip_suffix("-dev"))
.or_else(|| container.name.strip_prefix("archy-"))
.unwrap_or(container.name.as_str());
match orchestrator.health(app_id_candidate).await {
Ok(health) => {
health_map.insert(
app_id_candidate.to_string(),
serde_json::Value::String(health),
);
}
Err(_) => {
health_map.insert(
app_id_candidate.to_string(),
serde_json::Value::String("unknown".to_string()),
);
}
}
}
@@ -351,3 +424,90 @@ impl RpcHandler {
Ok(serde_json::Value::Object(health_map))
}
}
fn status_app_id_candidates(app_id: &str) -> Vec<String> {
let mut out = Vec::new();
let mut push = |s: &str| {
if !out.iter().any(|e: &String| e == s) {
out.push(s.to_string());
}
};
match app_id {
"bitcoin-knots" => {
push("bitcoin-knots");
push("bitcoin-core");
push("bitcoin");
}
"bitcoin-core" | "bitcoin" => {
push("bitcoin-core");
push("bitcoin-knots");
push("bitcoin");
}
"electrs" | "mempool-electrs" => {
push("electrs");
push("mempool-electrs");
push("electrumx");
}
_ => push(app_id),
}
out
}
fn status_container_name_candidates(app_id: &str) -> Vec<String> {
let mut out = Vec::new();
let mut push = |s: &str| {
if !out.iter().any(|e: &String| e == s) {
out.push(s.to_string());
}
};
match app_id {
"bitcoin-knots" | "bitcoin-core" | "bitcoin" => push("bitcoin-knots"),
"bitcoin-ui" => push("archy-bitcoin-ui"),
"lnd-ui" => push("archy-lnd-ui"),
"electrs-ui" => push("archy-electrs-ui"),
"electrs" | "mempool-electrs" => push("electrumx"),
_ => {}
}
push(app_id);
if let Some(stripped) = app_id.strip_prefix("archy-") {
push(stripped);
} else {
push(&format!("archy-{}", app_id));
}
out
}
async fn inspect_container_state_value(name: &str) -> Option<serde_json::Value> {
let out = tokio::process::Command::new("podman")
.args([
"inspect",
name,
"--format",
"{{.State.Status}} {{.State.Running}}",
])
.output()
.await
.ok()?;
if !out.status.success() {
return None;
}
let line = String::from_utf8_lossy(&out.stdout).trim().to_string();
if line.is_empty() {
return None;
}
let mut parts = line.split_whitespace();
let status = parts.next().unwrap_or("unknown");
let running = parts.next().unwrap_or("false") == "true";
Some(serde_json::json!({
"name": name,
"status": status,
"state": status,
"running": running,
}))
}
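
Note on the name handling in this file: the health handler strips `archipelago-…-dev` (dev orchestrator) or `archy-` (UI containers) from a runtime container name and otherwise uses the name unchanged, and `inspect_container_state_value` splits the two-field `podman inspect` format output. The sketch below isolates both steps as pure functions so they can be checked without podman. The function names are hypothetical; the logic follows the diff.

```rust
/// Maps a runtime container name back to an app id using the same
/// prefix/suffix chain as the health handler: `archipelago-<id>-dev` first,
/// then `archy-<id>`, otherwise the name as-is.
fn app_id_for_container(name: &str) -> &str {
    name.strip_prefix("archipelago-")
        .and_then(|s| s.strip_suffix("-dev"))
        .or_else(|| name.strip_prefix("archy-"))
        .unwrap_or(name)
}

/// Parses one line produced by the format string
/// `{{.State.Status}} {{.State.Running}}` into `(status, running)`.
/// Returns None for empty or whitespace-only output.
fn parse_inspect_line(line: &str) -> Option<(String, bool)> {
    let mut parts = line.split_whitespace();
    let status = parts.next()?;
    let running = parts.next().unwrap_or("false") == "true";
    Some((status.to_string(), running))
}
```

One consequence of the chain: a name that starts with `archipelago-` but lacks the `-dev` suffix is passed through whole rather than partially stripped.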

@@ -231,8 +231,7 @@ impl RpcHandler {
let (data, _) = self.state_manager.get_snapshot().await;
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;
let fips_npub =
crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let path = format!("/content/{}", content_id);
let (response, _transport) =
@@ -287,10 +286,13 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Invalid v3 onion address"));
}
let fips_npub =
crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
debug!("Browsing peer content at {} (fips={})", onion, fips_npub.is_some());
debug!(
"Browsing peer content at {} (fips={})",
onion,
fips_npub.is_some()
);
let (response, _transport) =
crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, "/content")
@@ -348,8 +350,7 @@ impl RpcHandler {
let (data, _) = self.state_manager.get_snapshot().await;
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;
let fips_npub =
crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let path = format!("/content/{}", content_id);
let (response, _transport) =
@@ -407,11 +408,15 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Invalid v3 onion address"));
}
let fips_npub =
crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let path = format!("/content/{}/preview", content_id);
debug!("Fetching content preview from {}{} (fips={})", onion, path, fips_npub.is_some());
debug!(
"Fetching content preview from {}{} (fips={})",
onion,
path,
fips_npub.is_some()
);
let (response, _transport) =
crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, &path)

@@ -1,10 +1,11 @@
use super::RpcHandler;
use anyhow::Result;
use std::sync::Arc;
impl RpcHandler {
/// Route an RPC method name to its handler, returning the result value.
pub(super) async fn dispatch(
&self,
self: &Arc<Self>,
method: &str,
params: Option<serde_json::Value>,
session_token: &Option<String>,
@@ -36,19 +37,23 @@ impl RpcHandler {
"container-install" => self.handle_container_install(params).await,
"container-start" => self.handle_container_start(params).await,
"container-stop" => self.handle_container_stop(params).await,
"container-restart" => self.handle_container_restart(params).await,
"container-remove" => self.handle_container_remove(params).await,
"container-list" => self.handle_container_list().await,
"container-status" => self.handle_container_status(params).await,
"container-logs" => self.handle_container_logs(params).await,
"container-health" => self.handle_container_health(params).await,
// Package management (for docker-compose apps)
"package.install" => self.handle_package_install(params).await,
// Package management (for docker-compose apps).
// install/uninstall/update return immediately with a
// transitional status; the actual work runs in a background
// tokio::spawn so the HTTP request doesn't block for minutes.
"package.install" => self.clone().spawn_package_install(params).await,
"package.start" => self.handle_package_start(params).await,
"package.stop" => self.handle_package_stop(params).await,
"package.restart" => self.handle_package_restart(params).await,
"package.uninstall" => self.handle_package_uninstall(params).await,
"package.update" => self.handle_package_update(params).await,
"package.uninstall" => self.clone().spawn_package_uninstall(params).await,
"package.update" => self.clone().spawn_package_update(params).await,
"app.filebrowser-token" => self.handle_filebrowser_token().await,
// Bundled app management (for pre-loaded container images)
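
Reviewer sketch: the hunk above switches `package.install`, `package.uninstall`, and `package.update` to wrappers that take `self: Arc<Self>`, flip state first, and finish the work in the background. The shape can be illustrated with the standard library alone. Everything here is an assumption for illustration: `Handler`, `Entry`, `spawn_install`, and `state_of` are hypothetical names, `std::thread` stands in for `tokio::spawn`, and a `Mutex<HashMap>` stands in for the real `StateManager`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

#[derive(Debug, Clone, PartialEq)]
enum PackageState {
    Installing,
    Running,
    Stopped,
}

#[derive(Debug, Clone, PartialEq)]
struct Entry {
    state: PackageState,
    message: Option<String>,
}

// Hypothetical stand-in for RpcHandler; only the state map is modelled.
struct Handler {
    packages: Mutex<HashMap<String, Entry>>,
}

impl Handler {
    fn new() -> Arc<Self> {
        Arc::new(Handler {
            packages: Mutex::new(HashMap::new()),
        })
    }

    // Flip to Installing, hand the slow work to a background thread, and
    // return a status string right away. The JoinHandle is returned only so
    // callers (and tests) can wait for the background work if they want to.
    fn spawn_install<F>(self: Arc<Self>, id: &str, work: F) -> (&'static str, JoinHandle<()>)
    where
        F: FnOnce() -> Result<(), String> + Send + 'static,
    {
        self.packages.lock().unwrap().insert(
            id.to_string(),
            Entry {
                state: PackageState::Installing,
                message: None,
            },
        );

        let handler = Arc::clone(&self);
        let id_owned = id.to_string();
        let join = thread::spawn(move || {
            let outcome = work();
            let mut packages = handler.packages.lock().unwrap();
            if let Some(entry) = packages.get_mut(&id_owned) {
                match outcome {
                    // Success: the wrapper itself writes the terminal state.
                    Ok(()) => {
                        entry.state = PackageState::Running;
                        entry.message = None;
                    }
                    // Failure: keep the entry visible and attach the error.
                    Err(e) => {
                        entry.state = PackageState::Stopped;
                        entry.message = Some(format!("Install failed: {}", e));
                    }
                }
            }
        });
        ("installing", join)
    }

    fn state_of(&self, id: &str) -> Option<Entry> {
        self.packages.lock().unwrap().get(id).cloned()
    }
}
```

Taking `self: Arc<Self>` is what lets the background task own a handle to the handler after the request returns, which is why the dispatcher now calls `self.clone()` before each wrapper.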

View File

@@ -403,7 +403,10 @@ impl RpcHandler {
});
let own_fips_npub = match own_fips_npub {
Some(n) => Some(n),
None => crate::fips::service::read_upstream_npub().await.ok().flatten(),
None => crate::fips::service::read_upstream_npub()
.await
.ok()
.flatten(),
};
let state = federation::build_local_state(
@@ -461,8 +464,7 @@ impl RpcHandler {
// the entry causes sync loops where the node syncs with itself
// forever. Drop it quietly — no useful recovery path.
let (own_data, _) = self.state_manager.get_snapshot().await;
let own_did_result =
identity::did_key_from_pubkey_hex(&own_data.server_info.pubkey).ok();
let own_did_result = identity::did_key_from_pubkey_hex(&own_data.server_info.pubkey).ok();
let own_onion_trim = own_data
.server_info
.tor_address
@@ -568,11 +570,7 @@ impl RpcHandler {
let new_peer_did = did.to_string();
tokio::spawn(async move {
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
if let Err(e) = crate::federation::sync_with_peer_by_did(
&data_dir,
&new_peer_did,
)
.await
if let Err(e) = crate::federation::sync_with_peer_by_did(&data_dir, &new_peer_did).await
{
tracing::debug!(
peer_did = %new_peer_did,

View File

@@ -169,8 +169,7 @@ impl RpcHandler {
if !anchor.address.contains(':') {
anyhow::bail!("address must be host:port (e.g. 192.168.1.116:8668)");
}
let list =
fips::anchors::add(&self.config.data_dir, anchor.clone()).await?;
let list = fips::anchors::add(&self.config.data_dir, anchor.clone()).await?;
// Push just the newly-added anchor into the running daemon so
// the user sees effect without waiting for the periodic apply.
let results = fips::anchors::apply(&[anchor]).await;

View File

@@ -742,24 +742,25 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Missing required parameter: id"))?;
validate_identity_id(id)?;
let relay_urls: Vec<String> = if let Some(arr) = params.get("relays").and_then(|v| v.as_array()) {
arr.iter()
.filter_map(|v| v.as_str())
.map(|s| s.to_string())
.collect()
} else if let Some(single) = params.get("relay").and_then(|v| v.as_str()) {
vec![single.to_string()]
} else {
// Default: every enabled relay in the user's Manage Relays list.
let statuses = crate::nostr_relays::list_relays(&self.config.data_dir)
.await
.unwrap_or_default();
statuses
.into_iter()
.filter(|s| s.enabled)
.map(|s| s.url)
.collect()
};
let relay_urls: Vec<String> =
if let Some(arr) = params.get("relays").and_then(|v| v.as_array()) {
arr.iter()
.filter_map(|v| v.as_str())
.map(|s| s.to_string())
.collect()
} else if let Some(single) = params.get("relay").and_then(|v| v.as_str()) {
vec![single.to_string()]
} else {
// Default: every enabled relay in the user's Manage Relays list.
let statuses = crate::nostr_relays::list_relays(&self.config.data_dir)
.await
.unwrap_or_default();
statuses
.into_iter()
.filter(|s| s.enabled)
.map(|s| s.url)
.collect()
};
if relay_urls.is_empty() {
anyhow::bail!("No enabled relays configured; add one under Manage Relays");
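
Reviewer sketch: the reformatted block above resolves relay URLs in a fixed order: an explicit `relays` array wins, then a single `relay`, then every enabled relay from the configured list. A dependency-free version of that precedence might look like the following. `RelayStatus` and `resolve_relay_urls` are hypothetical names, and plain `Option` arguments stand in for the `serde_json` params used in the real handler.

```rust
#[derive(Debug, Clone)]
struct RelayStatus {
    url: String,
    enabled: bool,
}

// Precedence: explicit list, then single relay, then enabled configured relays.
// An empty result is an error, matching the bail in the handler.
fn resolve_relay_urls(
    relays: Option<&[&str]>,
    relay: Option<&str>,
    configured: &[RelayStatus],
) -> Result<Vec<String>, String> {
    let urls: Vec<String> = if let Some(list) = relays {
        list.iter().map(|s| s.to_string()).collect()
    } else if let Some(single) = relay {
        vec![single.to_string()]
    } else {
        configured
            .iter()
            .filter(|s| s.enabled)
            .map(|s| s.url.clone())
            .collect()
    };

    if urls.is_empty() {
        return Err("No enabled relays configured; add one under Manage Relays".to_string());
    }
    Ok(urls)
}
```

Note that an explicitly empty `relays` array does not fall through to the defaults; it reaches the empty check and fails, which mirrors the control flow in the diff.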

View File

@@ -3,7 +3,7 @@ use anyhow::{Context, Result};
use base64::Engine;
use serde::{Deserialize, Serialize};
use super::{LndAmount, LndBalanceResponse};
use super::{read_lnd_admin_macaroon, LndAmount, LndBalanceResponse};
#[derive(Debug, Serialize)]
struct LndInfo {
@@ -34,11 +34,7 @@ struct LndChannelBalanceResponse {
impl RpcHandler {
pub(in crate::api::rpc) async fn handle_lnd_getinfo(&self) -> Result<serde_json::Value> {
let macaroon_path = "/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon";
let macaroon_bytes = tokio::fs::read(macaroon_path)
.await
.context("Failed to read LND admin macaroon — is LND installed?")?;
let macaroon_bytes = read_lnd_admin_macaroon().await?;
let macaroon_hex = hex::encode(&macaroon_bytes);
let client = reqwest::Client::builder()
@@ -114,7 +110,6 @@ impl RpcHandler {
/// for building lndconnect:// URIs in the frontend.
pub(crate) async fn handle_lnd_connect_info(&self) -> Result<serde_json::Value> {
let cert_path = "/var/lib/archipelago/lnd/tls.cert";
let macaroon_path = "/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon";
// Read and encode TLS cert (PEM -> DER -> base64url)
let cert_pem = tokio::fs::read_to_string(cert_path)
@@ -130,9 +125,7 @@ impl RpcHandler {
let cert_b64url = base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(&cert_der);
// Read and encode macaroon (binary -> base64url)
let macaroon_bytes = tokio::fs::read(macaroon_path)
.await
.context("Failed to read LND admin macaroon")?;
let macaroon_bytes = read_lnd_admin_macaroon().await?;
let macaroon_b64url =
base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(&macaroon_bytes);
@@ -183,10 +176,7 @@ impl RpcHandler {
pub(in crate::api::rpc) async fn handle_lnd_export_channel_backup(
&self,
) -> Result<serde_json::Value> {
let macaroon_path = "/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon";
let macaroon_bytes = tokio::fs::read(macaroon_path)
.await
.context("Failed to read LND admin macaroon")?;
let macaroon_bytes = read_lnd_admin_macaroon().await?;
let macaroon_hex = hex::encode(&macaroon_bytes);
let client = reqwest::Client::builder()
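
Reviewer sketch: the handlers in this file encode the same macaroon bytes two ways, hex for request headers and unpadded base64url for the lndconnect data. The real code uses the `hex` and `base64` crates; the hand-rolled `to_hex` and `to_base64url_no_pad` below are hypothetical stand-ins written only to show what each encoding produces.

```rust
// URL-safe base64 alphabet from RFC 4648, section 5.
const URL_SAFE_ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Lowercase hex, two characters per byte.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

// Base64url without '=' padding: each 3-byte group becomes 4 characters,
// and a trailing 1- or 2-byte group becomes 2 or 3 characters.
fn to_base64url_no_pad(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 4 + 2) / 3);
    for chunk in bytes.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(URL_SAFE_ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(URL_SAFE_ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(URL_SAFE_ALPHABET[((n >> 6) & 63) as usize] as char);
        }
        if chunk.len() > 2 {
            out.push(URL_SAFE_ALPHABET[(n & 63) as usize] as char);
        }
    }
    out
}
```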

View File

@@ -4,7 +4,11 @@ mod payments;
mod wallet;
use crate::api::rpc::RpcHandler;
use anyhow::{Context, Result};
use anyhow::{anyhow, Context, Result};
/// Canonical on-host path for LND's admin macaroon.
pub(crate) const LND_ADMIN_MACAROON_PATH: &str =
"/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon";
// Shared LND response types used by multiple submodules
#[derive(Debug, serde::Deserialize)]
@@ -17,15 +21,45 @@ pub(super) struct LndAmount {
pub sat: Option<String>,
}
/// Read LND's admin macaroon from disk.
///
/// The macaroon lives inside LND's container data dir and is owned by a
/// rootless-podman subordinate UID (typically 100000), mode 640. The
/// archipelago server runs as UID 1000 and therefore cannot read it
/// directly. We first try a plain read (works if an operator has relaxed
/// permissions), then fall back to `sudo cat` — mirroring the pattern
/// already used for Tor hidden-service hostnames.
pub(crate) async fn read_lnd_admin_macaroon() -> Result<Vec<u8>> {
match tokio::fs::read(LND_ADMIN_MACAROON_PATH).await {
Ok(bytes) => Ok(bytes),
Err(direct_err) => {
let output = tokio::process::Command::new("sudo")
.args(["-n", "cat", LND_ADMIN_MACAROON_PATH])
.output()
.await
.with_context(|| {
format!(
"Failed to read LND admin macaroon (direct: {direct_err}); sudo fallback also failed"
)
})?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(anyhow!(
"Failed to read LND admin macaroon — is LND installed? (direct: {direct_err}; sudo: {})",
stderr.trim()
));
}
Ok(output.stdout)
}
}
}
impl RpcHandler {
/// Helper: create an authenticated LND REST client.
/// Returns an HTTP client configured for LND's self-signed TLS and the
/// hex-encoded admin macaroon for request headers.
pub(crate) async fn lnd_client(&self) -> Result<(reqwest::Client, String)> {
let macaroon_path = "/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon";
let macaroon_bytes = tokio::fs::read(macaroon_path)
.await
.context("Failed to read LND admin macaroon — is LND installed?")?;
let macaroon_bytes = read_lnd_admin_macaroon().await?;
let macaroon_hex = hex::encode(&macaroon_bytes);
let client = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(15))
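
Reviewer sketch: `read_lnd_admin_macaroon` tries a direct read and falls back to `sudo -n cat`, reporting both errors if the fallback also fails. A synchronous, standard-library version of the same pattern is below. `read_with_fallback` and `sudo_cat` are hypothetical names, and the fallback is passed in as a closure purely so the pattern can be exercised without `sudo`; the real function is async and hard-codes the fallback.

```rust
use std::fs;
use std::io;
use std::process::Command;

// Try a plain read first; on failure run the fallback and, if that fails
// too, report both causes in one message.
fn read_with_fallback<F>(path: &str, fallback: F) -> Result<Vec<u8>, String>
where
    F: FnOnce(&str) -> io::Result<Vec<u8>>,
{
    match fs::read(path) {
        Ok(bytes) => Ok(bytes),
        Err(direct_err) => fallback(path).map_err(|fallback_err| {
            format!(
                "Failed to read {} (direct: {}; fallback: {})",
                path, direct_err, fallback_err
            )
        }),
    }
}

// Non-interactive sudo read. `-n` makes sudo fail instead of prompting for a
// password, so a server process never blocks on a TTY.
fn sudo_cat(path: &str) -> io::Result<Vec<u8>> {
    let output = Command::new("sudo").args(["-n", "cat", path]).output()?;
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            stderr.trim().to_string(),
        ));
    }
    Ok(output.stdout)
}
```

A caller would combine the two as `read_with_fallback(path, sudo_cat)`. Keeping the direct error in the final message is what makes "file missing" distinguishable from "file present but unreadable".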

View File

@@ -761,7 +761,9 @@ impl RpcHandler {
.await
.map_err(|e| anyhow::anyhow!("Read body failed: {}", e))?;
let meta = blob_store.put(&bytes, &mime, filename_hint, None, false).await?;
let meta = blob_store
.put(&bytes, &mime, filename_hint, None, false)
.await?;
if meta.cid != cid {
anyhow::bail!("CID mismatch: expected {}, got {}", cid, meta.cid);
}

View File

@@ -31,6 +31,7 @@ mod streaming;
mod system;
mod tor;
mod totp;
mod transitional;
mod transport;
mod update;
mod vpn;
@@ -39,7 +40,7 @@ mod webhooks;
use crate::auth::AuthManager;
use crate::config::Config;
use crate::container::DevContainerOrchestrator;
use crate::container::{ContainerOrchestrator, DevContainerOrchestrator};
use crate::monitoring::MetricsStore;
use crate::port_allocator::PortAllocator;
use crate::rate_limit::{EndpointRateLimiter, LoginRateLimiter};
@@ -62,7 +63,14 @@ pub(crate) const DEV_DEFAULT_PASSWORD: &str = "password123";
pub struct RpcHandler {
config: Config,
auth_manager: AuthManager,
orchestrator: Option<Arc<DevContainerOrchestrator>>,
/// Shared lifecycle orchestrator (Dev or Prod). Always `Some` in a normal
/// build — the only reason it is `Option` is so tests that don't exercise
/// container RPCs can skip constructing one.
orchestrator: Option<Arc<dyn ContainerOrchestrator>>,
/// Concrete handle to the dev orchestrator, when we're in dev mode. Used by
/// `container-install { manifest_path }` which takes an ad-hoc manifest
/// path and is not part of the shared trait.
dev_orchestrator: Option<Arc<DevContainerOrchestrator>>,
state_manager: Arc<StateManager>,
pub(crate) metrics_store: Arc<MetricsStore>,
port_allocator: Arc<tokio::sync::Mutex<PortAllocator>>,
@@ -79,6 +87,15 @@ pub struct RpcHandler {
/// Our own Ed25519 pubkey hex — needed by ContentRef senders for cap scoping
/// and by ContentRef receivers to request caps scoped to themselves.
pub(crate) self_pubkey_hex: Arc<tokio::sync::RwLock<Option<String>>>,
/// Kick the package scanner to run immediately (bypassing the 60s interval).
/// Used by install/update success paths so the fresh manifest (with populated
/// `interfaces.main.ui`) lands before we flip state to Running — closes the
/// "Launch button is missing for up to 60s after install" UX gap.
pub(crate) scan_kick: Arc<tokio::sync::Notify>,
/// Monotonic counter incremented by the scan loop after each completed scan.
/// Install/update success paths subscribe to this to know when a kicked scan
/// has actually finished before flipping to the terminal state.
pub(crate) scan_tick: Arc<tokio::sync::watch::Sender<u64>>,
}
impl RpcHandler {
@@ -87,15 +104,10 @@ impl RpcHandler {
state_manager: Arc<StateManager>,
metrics_store: Arc<MetricsStore>,
session_store: SessionStore,
orchestrator: Option<Arc<dyn ContainerOrchestrator>>,
dev_orchestrator: Option<Arc<DevContainerOrchestrator>>,
) -> Result<Self> {
let auth_manager = AuthManager::new(config.data_dir.clone());
let orchestrator = if config.dev_mode {
Some(Arc::new(
DevContainerOrchestrator::new(config.clone()).await?,
))
} else {
None
};
let port_allocator = Arc::new(tokio::sync::Mutex::new(
PortAllocator::new(&config.data_dir).await?,
));
@@ -129,6 +141,7 @@ impl RpcHandler {
config,
auth_manager,
orchestrator,
dev_orchestrator,
state_manager,
metrics_store,
port_allocator,
@@ -140,6 +153,8 @@ impl RpcHandler {
transport_router: Arc::new(tokio::sync::RwLock::new(None)),
blob_store: Arc::new(tokio::sync::RwLock::new(None)),
self_pubkey_hex: Arc::new(tokio::sync::RwLock::new(None)),
scan_kick: Arc::new(tokio::sync::Notify::new()),
scan_tick: Arc::new(tokio::sync::watch::channel(0u64).0),
})
}
@@ -180,6 +195,21 @@ impl RpcHandler {
Arc::clone(&self.mesh_service)
}
/// Shared Notify handle the package-scanner loop waits on (in addition to
/// its periodic tick). Install/update success paths call `notify_one()` to
/// trigger an immediate scan so the fresh manifest lands before we flip to
/// the terminal Running state.
pub fn scan_kick(&self) -> Arc<tokio::sync::Notify> {
Arc::clone(&self.scan_kick)
}
/// Sender half of the scan-completion watch channel. The scanner bumps this
/// counter after every finished scan; install/update wait for an advance
/// after kicking so they know the fresh manifest has landed.
pub fn scan_tick(&self) -> Arc<tokio::sync::watch::Sender<u64>> {
Arc::clone(&self.scan_tick)
}
fn cookie_suffix_for_request(&self, headers: &hyper::header::HeaderMap) -> &'static str {
// Only set Secure flag when the original request was over HTTPS.
// Nginx sends X-Forwarded-Proto: https for HTTPS connections.
@@ -197,7 +227,10 @@ impl RpcHandler {
""
}
pub async fn handle(&self, req: Request<hyper::Body>) -> Result<Response<hyper::Body>> {
pub async fn handle(
self: Arc<Self>,
req: Request<hyper::Body>,
) -> Result<Response<hyper::Body>> {
// Extract session cookie before consuming the request
let (parts, body) = req.into_parts();
let session_token = session::extract_session_cookie(&parts.headers);
@@ -376,7 +409,7 @@ impl RpcHandler {
// Route to handler (track latency for metrics)
let rpc_start = std::time::Instant::now();
let result = self.dispatch(&rpc_req.method, params, &session_token).await;
let result = Self::dispatch(&self, &rpc_req.method, params, &session_token).await;
// Record RPC latency for monitoring
let elapsed_ms = rpc_start.elapsed().as_secs_f64() * 1000.0;
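
Reviewer sketch: the struct change above keeps two handles, a shared `Arc<dyn ContainerOrchestrator>` for lifecycle calls and a concrete dev handle for the one RPC that is not on the trait. A minimal model of that arrangement follows. The trait method `start`, the `ProdOrchestrator` type, `install_from_manifest`, and `build_handler` are all hypothetical; the diff does not show the trait's real methods or how the caller constructs the handles.

```rust
use std::sync::Arc;

// Hypothetical shared lifecycle trait; the real trait's methods are not shown
// in this diff.
trait ContainerOrchestrator: Send + Sync {
    fn start(&self, id: &str) -> String;
}

struct DevOrchestrator;
struct ProdOrchestrator;

impl ContainerOrchestrator for DevOrchestrator {
    fn start(&self, id: &str) -> String {
        format!("dev:start:{}", id)
    }
}

impl ContainerOrchestrator for ProdOrchestrator {
    fn start(&self, id: &str) -> String {
        format!("prod:start:{}", id)
    }
}

impl DevOrchestrator {
    // Dev-only operation that is deliberately not part of the shared trait.
    fn install_from_manifest(&self, manifest_path: &str) -> String {
        format!("dev:install:{}", manifest_path)
    }
}

struct Handler {
    orchestrator: Option<Arc<dyn ContainerOrchestrator>>,
    dev_orchestrator: Option<Arc<DevOrchestrator>>,
}

fn build_handler(dev_mode: bool) -> Handler {
    if dev_mode {
        // One allocation, two handles: the trait object and the concrete type.
        let dev = Arc::new(DevOrchestrator);
        let shared: Arc<dyn ContainerOrchestrator> = dev.clone();
        Handler {
            orchestrator: Some(shared),
            dev_orchestrator: Some(dev),
        }
    } else {
        let prod: Arc<dyn ContainerOrchestrator> = Arc::new(ProdOrchestrator);
        Handler {
            orchestrator: Some(prod),
            dev_orchestrator: None,
        }
    }
}
```

Moving construction out of `RpcHandler::new` and passing both handles in, as the diff does, also means tests can supply `None` without building a real orchestrator.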

View File

@@ -0,0 +1,452 @@
//! Async wrappers for `package.install`, `package.uninstall`, `package.update`.
//!
//! The inner `handle_package_*` functions are large (install is 480 lines with
//! the stack dispatchers, update is 300, uninstall is 200) and do their own
//! fine-grained progress tracking via `install_progress` and `uninstall_stage`.
//! We wrap them rather than refactor them.
//!
//! Each wrapper:
//! 1. Parses + validates the RPC params (cheap, synchronous). Errors here
//! return immediately to the caller before any state change.
//! 2. Flips the package state to the transitional variant
//! (`Installing` / `Removing` / `Updating`) so the UI sees it on the
//! next WebSocket push (before the RPC response even lands).
//! 3. `tokio::spawn`s a background task that invokes the existing
//! `handle_package_*` method on the Arc-held self.
//! 4. On task success: no state change needed — the inner handler has
//! already written the terminal state (Running for install/update, or
//! removed the entry for uninstall).
//! 5. On task failure: revert state to the pre-transition value. Install has
//! no pre-state, so its entry is kept as `Stopped` with the error stored in
//! `install_progress.message`. Write a line to the persistent install log
//! and clear any stale `uninstall_stage` label.
//! 6. Returns `{ "status": "installing" }` etc. immediately.
//!
//! The server package-scan loop's `merge_preserving_transitional` helper
//! already knows to preserve `Installing` / `Removing` / `Updating` between
//! scans, so live progress updates broadcast from inside the spawned task
//! reach the UI correctly.
use super::install::install_log;
use crate::api::rpc::RpcHandler;
use crate::data_model::PackageState;
use crate::state::StateManager;
use anyhow::Result;
use std::sync::Arc;
use tracing::{error, info, warn};
impl RpcHandler {
/// Async wrapper for `package.install`. Returns `{ "status": "installing" }`
/// immediately after flipping state to `Installing` and spawning the
/// actual install pipeline. On failure, keeps the package entry visible as
/// `Stopped` with the error in `install_progress.message`, so the user can
/// see what went wrong and retry or uninstall.
pub(in crate::api::rpc) async fn spawn_package_install(
self: Arc<Self>,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
// Extract + validate package_id synchronously so bad params fail
// fast without touching state.
let params_val = params
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let package_id = params_val
.get("id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?
.to_string();
super::validation::validate_app_id(&package_id)?;
// Reject if already in a transitional lifecycle (prevents double-click
// queuing two installs on the same package).
{
let (data, _) = self.state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get(&package_id) {
if matches!(
entry.state,
PackageState::Installing | PackageState::Removing | PackageState::Updating
) {
return Err(anyhow::anyhow!(
"{} is already {:?}",
package_id,
entry.state
));
}
}
}
// Flip state to Installing BEFORE the spawn so the first WebSocket
// push carries the transitional state. Uses the same
// `create_installing_entry` path the inner handler would use once
// it starts pulling, so the UI sees a consistent shape.
flip_to_installing(&self.state_manager, &package_id).await;
install_log(&format!("INSTALL SPAWN: {}", package_id)).await;
let handler = Arc::clone(&self);
let package_id_spawn = package_id.clone();
tokio::spawn(async move {
match handler.handle_package_install(params).await {
Ok(_) => {
info!("package.install {}: complete", package_id_spawn);
// The install pipeline has verified the container is up
// and healthy (see install.rs post-start exit check).
// Kick the scanner first so the fresh manifest (with
// `interfaces.main.ui` from the live port binding) lands
// BEFORE we flip to Running — without this the Launch
// button is missing for up to 60s after a successful
// install, because the skeletal install-time manifest
// has `interfaces: None`.
kick_scanner_and_wait(&handler).await;
// We MUST explicitly transition out of Installing here:
// `merge_preserving_transitional` in the package-scan
// loop treats Installing as RPC-owned and refuses to
// let the scanner overwrite it with the observed
// Running state. Without this write, the entry stays
// stuck at Installing forever.
set_package_state(
&handler.state_manager,
&package_id_spawn,
PackageState::Running,
)
.await;
handler.clear_install_progress(&package_id_spawn).await;
}
Err(e) => {
error!("package.install {} failed: {:#}", package_id_spawn, e);
install_log(&format!("INSTALL FAIL: {}: {:#}", package_id_spawn, e)).await;
// Don't remove the entry — that's what made the card
// vanish from My Apps mid-install / between retry-loop
// attempts (e.g. tailscale's entrypoint failure). Leave
// the entry visible with state=Stopped + the install
// error in install_progress.message so the user can see
// what went wrong and decide whether to retry or
// uninstall. clear_install_progress would erase the
// message, so we set it explicitly here instead.
let err_msg = format!("Install failed: {:#}", e);
let (mut data, _) = handler.state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get_mut(&package_id_spawn) {
entry.state = PackageState::Stopped;
entry.install_progress = Some(crate::data_model::InstallProgress {
size: 0,
downloaded: 0,
phase: None,
message: Some(err_msg),
});
handler.state_manager.update_data(data).await;
}
}
}
});
Ok(serde_json::json!({
"status": "installing",
"package_id": package_id,
}))
}
/// Async wrapper for `package.uninstall`. Returns `{ "status": "removing" }`
/// immediately. State stays `Removing` until the inner handler finishes
/// (including the `sudo rm -rf` of app data, which can take minutes for
/// bitcoin-core's chainstate). On failure, reverts to the pre-transition
/// state (usually Running or Stopped) so the user can retry.
pub(in crate::api::rpc) async fn spawn_package_uninstall(
self: Arc<Self>,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params_val = params
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let package_id = params_val
.get("id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?
.to_string();
super::validation::validate_app_id(&package_id)?;
// Reject if already in a transitional lifecycle.
{
let (data, _) = self.state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get(&package_id) {
if matches!(
entry.state,
PackageState::Installing | PackageState::Removing | PackageState::Updating
) {
return Err(anyhow::anyhow!(
"{} is already {:?}",
package_id,
entry.state
));
}
}
}
let pre_state =
flip_package_state(&self.state_manager, &package_id, PackageState::Removing).await;
install_log(&format!("UNINSTALL SPAWN: {}", package_id)).await;
let handler = Arc::clone(&self);
let package_id_spawn = package_id.clone();
tokio::spawn(async move {
match handler.handle_package_uninstall(params).await {
Ok(_) => {
info!("package.uninstall {}: complete", package_id_spawn);
// Inner handler already removed the package entry on
// success. Nothing more to do here.
}
Err(e) => {
error!("package.uninstall {} failed: {:#}", package_id_spawn, e);
install_log(&format!("UNINSTALL FAIL: {}: {:#}", package_id_spawn, e)).await;
// Revert to pre-transition state so the user can retry.
// Also clear any stale uninstall_stage label.
if let Some(prev) = pre_state {
set_package_state_and_clear_uninstall_stage(
&handler.state_manager,
&package_id_spawn,
prev,
)
.await;
}
}
}
});
Ok(serde_json::json!({
"status": "removing",
"package_id": package_id,
}))
}
/// Async wrapper for `package.update`. Returns `{ "status": "updating" }`
/// immediately. The inner handler already manages its own rollback on
/// failure (restarts old containers); this wrapper just flips state and
/// spawns.
pub(in crate::api::rpc) async fn spawn_package_update(
self: Arc<Self>,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params_val = params
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let package_id = params_val
.get("id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?
.to_string();
super::validation::validate_app_id(&package_id)?;
// Reject if already in a transitional lifecycle.
{
let (data, _) = self.state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get(&package_id) {
if matches!(
entry.state,
PackageState::Installing | PackageState::Removing | PackageState::Updating
) {
return Err(anyhow::anyhow!(
"{} is already {:?}",
package_id,
entry.state
));
}
}
}
// The inner handler flips state to Updating itself, but we do it
// here too so the transitional state lands before the spawn yields.
let pre_state =
flip_package_state(&self.state_manager, &package_id, PackageState::Updating).await;
install_log(&format!("UPDATE SPAWN: {}", package_id)).await;
let handler = Arc::clone(&self);
let package_id_spawn = package_id.clone();
tokio::spawn(async move {
match handler.handle_package_update(params).await {
Ok(_) => {
info!("package.update {}: complete", package_id_spawn);
// Same reasoning as install: the merge_preserving_transitional
// helper treats Updating as RPC-owned, so we MUST write the
// terminal Running state ourselves or the entry will stay
// stuck at Updating forever. The update pipeline has
// already verified the new container is running via its
// post-recreate check.
// Kick the scanner first so any manifest changes from the
// new image version (interfaces, ports, etc.) land before
// we flip to Running.
kick_scanner_and_wait(&handler).await;
set_package_state(
&handler.state_manager,
&package_id_spawn,
PackageState::Running,
)
.await;
}
Err(e) => {
error!("package.update {} failed: {:#}", package_id_spawn, e);
install_log(&format!("UPDATE FAIL: {}: {:#}", package_id_spawn, e)).await;
// Inner handler already ran rollback_update + cleared
// update state, but be defensive: revert to pre-state
// in case the inner flow died before its cleanup.
if let Some(prev) = pre_state {
set_package_state(&handler.state_manager, &package_id_spawn, prev).await;
}
}
}
});
Ok(serde_json::json!({
"status": "updating",
"package_id": package_id,
}))
}
}
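
Reviewer sketch: all three wrappers above repeat the same "reject if already transitional" block. One way to factor it, shown here only as a suggestion, is a pair of small pure functions. `is_transitional` and `ensure_not_transitional` are hypothetical names, the enum is a reduced copy of `PackageState` with only the variants that appear in this diff, and a `String` error stands in for `anyhow::Error`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum PackageState {
    Installing,
    Removing,
    Updating,
    Running,
    Stopped,
}

// True for the states the wrappers treat as RPC-owned and in progress.
fn is_transitional(state: &PackageState) -> bool {
    matches!(
        state,
        PackageState::Installing | PackageState::Removing | PackageState::Updating
    )
}

// Guard used before flipping state: a missing entry or a settled state is
// fine, an in-flight lifecycle operation is rejected with the same message
// format the wrappers use.
fn ensure_not_transitional(
    package_id: &str,
    current: Option<&PackageState>,
) -> Result<(), String> {
    match current {
        Some(state) if is_transitional(state) => {
            Err(format!("{} is already {:?}", package_id, state))
        }
        _ => Ok(()),
    }
}
```

The check and the subsequent flip are still two separate snapshot reads in the diff, so a helper like this removes duplication but does not by itself close the window between check and flip.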
// ---------------------------------------------------------------------------
// State-manager helpers (free fns, usable from inside spawned tasks)
// ---------------------------------------------------------------------------
/// Create or update the entry for this package with `Installing` state.
/// Matches what the inner handler's `set_install_progress` would do on first
/// call, but fires before the spawn so the UI sees it immediately.
async fn flip_to_installing(state_manager: &StateManager, package_id: &str) {
use crate::data_model::{Description, Manifest, PackageDataEntry, StaticFiles};
let (mut data, _) = state_manager.get_snapshot().await;
let entry = data
.package_data
.entry(package_id.to_string())
.or_insert_with(|| PackageDataEntry {
state: PackageState::Installing,
health: None,
exit_code: None,
static_files: StaticFiles {
license: String::new(),
instructions: String::new(),
// Leave icon empty during the transient Installing window:
// hardcoding `<id>.png` is wrong for ~half our apps (many use
// `.svg` / `.webp`), producing a broken-image flicker until
// the scanner refreshes the entry. The frontend's `icon`
// computed falls through to `curatedMap.get(id)?.icon` which
// has the correct extensions for known apps.
icon: String::new(),
},
manifest: Manifest {
id: package_id.to_string(),
title: package_id.to_string(),
version: String::new(),
description: Description {
short: "Installing...".to_string(),
long: String::new(),
},
release_notes: String::new(),
license: String::new(),
wrapper_repo: String::new(),
upstream_repo: String::new(),
support_site: String::new(),
marketing_site: String::new(),
donation_url: None,
author: None,
website: None,
interfaces: None,
tier: None,
},
installed: None,
install_progress: None,
uninstall_stage: None,
available_update: None,
});
entry.state = PackageState::Installing;
state_manager.update_data(data).await;
}
/// Flip an existing entry's state and return the pre-flip value (or None if
/// no entry existed). Used for revert-on-failure.
async fn flip_package_state(
state_manager: &StateManager,
package_id: &str,
new_state: PackageState,
) -> Option<PackageState> {
let (mut data, _) = state_manager.get_snapshot().await;
let prev = data.package_data.get(package_id).map(|e| e.state.clone());
if let Some(entry) = data.package_data.get_mut(package_id) {
entry.state = new_state;
state_manager.update_data(data).await;
} else {
warn!(
"flip_package_state: no entry for {} — cannot flip",
package_id
);
}
prev
}
/// Set state unconditionally (no-op if entry no longer exists).
async fn set_package_state(
state_manager: &StateManager,
package_id: &str,
new_state: PackageState,
) {
let (mut data, _) = state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get_mut(package_id) {
if entry.state != new_state {
entry.state = new_state;
state_manager.update_data(data).await;
}
}
}
/// Set state and clear the uninstall_stage label. Used when an uninstall
/// fails and we revert — the user doesn't want a stale "Removing app data"
/// message sitting on a Running entry.
async fn set_package_state_and_clear_uninstall_stage(
state_manager: &StateManager,
package_id: &str,
new_state: PackageState,
) {
let (mut data, _) = state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get_mut(package_id) {
entry.state = new_state;
entry.uninstall_stage = None;
state_manager.update_data(data).await;
}
}
/// Remove a package entry from state. Originally used for install-failure
/// cleanup; that path now keeps the entry as `Stopped` with an error message,
/// so nothing in this module calls this helper any more.
#[allow(dead_code)]
async fn remove_package_entry(state_manager: &StateManager, package_id: &str) {
let (mut data, _) = state_manager.get_snapshot().await;
if data.package_data.remove(package_id).is_some() {
state_manager.update_data(data).await;
}
}
/// Kick the container scanner to run immediately and wait for it to finish
/// (with a 2s timeout). Used by install/update success paths so the fresh
/// manifest — with `interfaces.main.ui` populated from the now-running
/// container's port binding — lands BEFORE we flip state to Running.
///
/// Without this, the frontend sees `state = running` but the skeletal
/// install-time manifest (interfaces = None), and hides the Launch button
/// for up to the full 60s scan interval.
///
/// The scan merges via `merge_preserving_transitional`, which keeps
/// state = Installing (we haven't flipped yet) while taking the fresh
/// manifest. After this returns, the caller writes Running on top of the
/// now-populated manifest.
async fn kick_scanner_and_wait(handler: &RpcHandler) {
let mut rx = handler.scan_tick.subscribe();
let start = *rx.borrow_and_update();
handler.scan_kick.notify_one();
// 2s is well above a typical podman scan (~200ms on .228, ~500ms worst
// case). If it times out we proceed anyway — the next 60s scan will
// self-heal and the worst case is the pre-fix behavior (Launch button
// appears a bit late).
let _ = tokio::time::timeout(std::time::Duration::from_secs(2), async {
while *rx.borrow_and_update() == start {
if rx.changed().await.is_err() {
break;
}
}
})
.await;
}
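
Reviewer sketch: `kick_scanner_and_wait` relies on a scanner loop that wakes on either its interval or a kick, then bumps a counter that waiters observe. The scanner side is not part of this diff, so the following is a guess at the overall handshake using only the standard library: an `mpsc` channel with `recv_timeout` stands in for `tokio::sync::Notify` plus the periodic tick, and a `Mutex<u64>` with a `Condvar` stands in for the `watch` channel. `ScanTick`, `spawn_scanner`, and `kick_and_wait` are hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

// Monotonic scan counter plus a condition variable to wake waiters.
struct ScanTick {
    count: Mutex<u64>,
    advanced: Condvar,
}

impl ScanTick {
    fn new() -> Arc<Self> {
        Arc::new(ScanTick {
            count: Mutex::new(0),
            advanced: Condvar::new(),
        })
    }
}

// Scanner loop: run a scan when kicked or when the interval elapses, then
// bump the counter. Exits when every kick sender has been dropped.
fn spawn_scanner(interval: Duration, tick: Arc<ScanTick>) -> mpsc::Sender<()> {
    let (kick_tx, kick_rx) = mpsc::channel::<()>();
    thread::spawn(move || loop {
        match kick_rx.recv_timeout(interval) {
            Ok(()) | Err(RecvTimeoutError::Timeout) => {
                // A real scanner would inspect containers here.
                let mut count = tick.count.lock().unwrap();
                *count += 1;
                tick.advanced.notify_all();
            }
            Err(RecvTimeoutError::Disconnected) => break,
        }
    });
    kick_tx
}

// Record the current counter, kick the scanner, then wait until the counter
// moves or the timeout expires. Returns true if a scan completed in time.
fn kick_and_wait(kick: &mpsc::Sender<()>, tick: &ScanTick, timeout: Duration) -> bool {
    let start = *tick.count.lock().unwrap();
    let _ = kick.send(());
    let guard = tick.count.lock().unwrap();
    let (_guard, result) = tick
        .advanced
        .wait_timeout_while(guard, timeout, |count| *count == start)
        .unwrap();
    !result.timed_out()
}
```

Reading the counter before sending the kick matters: if the scan finishes before the waiter starts waiting, the counter has already moved past `start` and the wait returns at once instead of sleeping through the timeout.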

View File

@@ -9,7 +9,7 @@ pub(super) const TRUSTED_REGISTRIES: &[&str] = &[
"ghcr.io/",
"localhost/",
"git.tx1138.com/",
"23.182.128.160:3000/",
"146.59.87.168:3000/",
];
/// Validate Docker image against trusted registry allowlist.
@@ -29,7 +29,7 @@ pub(super) fn is_valid_docker_image(image: &str) -> bool {
};
matches!(
registry,
"docker.io" | "ghcr.io" | "localhost" | "git.tx1138.com" | "23.182.128.160:3000"
"docker.io" | "ghcr.io" | "localhost" | "git.tx1138.com" | "146.59.87.168:3000"
)
}
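
Reviewer sketch: the two hunks above swap one mirror address for another in both the prefix list and the `matches!` allowlist. How the registry is extracted from the image string is elided in this diff, so the rule below is an assumption borrowed from the usual Docker convention: the first path component counts as a registry only if it contains a `.` or `:` or equals `localhost`, otherwise the image is treated as `docker.io`. `registry_of` and `is_trusted_image` are hypothetical names.

```rust
// Allowlist after this change; the old mirror address is no longer present.
const TRUSTED_REGISTRIES: &[&str] = &[
    "docker.io",
    "ghcr.io",
    "localhost",
    "git.tx1138.com",
    "146.59.87.168:3000",
];

// Assumed extraction rule (not shown in the diff): a leading component that
// looks like a host is the registry, anything else defaults to docker.io.
fn registry_of(image: &str) -> &str {
    match image.split_once('/') {
        Some((first, _))
            if first.contains('.') || first.contains(':') || first == "localhost" =>
        {
            first
        }
        _ => "docker.io",
    }
}

fn is_trusted_image(image: &str) -> bool {
    let registry = registry_of(image);
    TRUSTED_REGISTRIES.iter().any(|trusted| *trusted == registry)
}
```

Because the address appears in two places in the real file, a single shared constant feeding both the prefix list and the match would make the next mirror move a one-line change.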
@@ -174,7 +174,7 @@ pub(super) fn get_health_check_args(app_id: &str, _rpc_pass: &str) -> Vec<String
("curl -sf http://localhost:8000/ || exit 1", "60s", "3")
}
"nextcloud" => (
"curl -sf http://localhost:80/status.php || exit 1",
"curl -s -o /dev/null http://localhost:80/status.php || exit 1",
"30s",
"3",
),
@@ -194,7 +194,12 @@ pub(super) fn get_health_check_args(app_id: &str, _rpc_pass: &str) -> Vec<String
"vaultwarden" => ("curl -sf http://localhost:80/alive || exit 1", "30s", "3"),
"uptime-kuma" => ("curl -sf http://localhost:3001/ || exit 1", "30s", "3"),
"filebrowser" => ("curl -sf http://localhost:80/health || exit 1", "30s", "3"),
"searxng" => ("curl -sf http://localhost:8080/ || exit 1", "30s", "3"),
"botfights" => (
"node -e \"fetch(\\\"http://127.0.0.1:9100/api/health\\\").then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))\"",
"30s",
"3",
),
"searxng" => ("wget -q -O /dev/null http://localhost:8080/ || exit 1", "30s", "3"),
"photoprism" => (
"curl -sf http://localhost:2342/api/v1/status || exit 1",
"60s",
@@ -210,11 +215,7 @@ pub(super) fn get_health_check_args(app_id: &str, _rpc_pass: &str) -> Vec<String
"30s",
"3",
),
"portainer" => (
"curl -sf http://localhost:9000/api/status || exit 1",
"30s",
"3",
),
"portainer" => return vec![],
"ollama" => ("curl -sf http://localhost:11434/ || exit 1", "30s", "3"),
"fedimint" => ("curl -sf http://localhost:8175/ || exit 1", "60s", "3"),
"fedimint-gateway" => ("curl -sf http://localhost:8176/ || exit 1", "60s", "3"),
@@ -243,13 +244,19 @@ pub(super) fn get_health_check_args(app_id: &str, _rpc_pass: &str) -> Vec<String
/// Get per-app memory limit.
pub(super) fn get_memory_limit(app_id: &str) -> &'static str {
match app_id {
// Heavy apps
"bitcoin" | "bitcoin-core" | "bitcoin-knots" => "4g",
// Heavy apps. Bitcoin: dbcache uses ~4GB; the daemon also needs
// headroom for mempool + connection buffers + script-verifier
// memory + I/O. 4g caused OOM-cascades during IBD. 8g is the
// floor; ideally this would be host-RAM aware (next pass).
"bitcoin" | "bitcoin-core" | "bitcoin-knots" => "8g",
// ElectrumX: bumped from 1g to 2g so its CACHE_MB has somewhere
// to live during initial blockchain indexing. CACHE_MB=2048 in
// env vars below requires this much.
"electrumx" | "mempool-electrs" | "electrs" => "2g",
"cryptpad" => "512m",
"ollama" => "4g",
// Medium apps
"lnd" => "512m",
"electrumx" | "mempool-electrs" | "electrs" => "1g",
"nextcloud" => "1g",
"immich_server" | "immich" => "1g",
"btcpay-server" | "btcpayserver" => "1g",
@@ -402,7 +409,6 @@ pub(super) fn get_data_dirs_for_app(package_id: &str) -> Vec<String> {
format!("{}/mempool", base),
format!("{}/mysql-mempool", base),
format!("{}/electrumx", base),
format!("{}/mempool-electrs", base),
],
"fedimint" => vec![
format!("{}/fedimint", base),
@@ -497,6 +503,16 @@ pub(super) async fn get_app_config(
// only what's in bitcoin.conf + argv. The shared bitcoin.conf
// carries rpcauth; we inject the networking flags as CLI args so
// RPC is reachable from the bitcoin-ui companion container.
//
// Sync-speed flags:
// -dbcache=4096 — UTXO set cache; 4GB is the sweet spot before
// diminishing returns. Container has --memory=8g now so
// there's headroom for mempool + connections.
// -par=0 — use all available cores for script
// verification (0 means auto-detect, which is also bitcoind's
// default; set explicitly here for visibility). Was
// effectively pinned at 2 by --cpus=2 (now removed).
// -maxconnections=125 — default but explicit, so ops can
// tune downward on bandwidth-constrained nodes.
Some(vec![
"-server=1".to_string(),
"-rpcbind=0.0.0.0".to_string(),
@@ -504,6 +520,9 @@ pub(super) async fn get_app_config(
"-rpcport=8332".to_string(),
"-printtoconsole=1".to_string(),
"-datadir=/home/bitcoin/.bitcoin".to_string(),
"-dbcache=4096".to_string(),
"-par=0".to_string(),
"-maxconnections=125".to_string(),
]),
),
"bitcoin" | "bitcoin-knots" => (
@@ -533,9 +552,7 @@ pub(super) async fn get_app_config(
"--bitcoin.node=bitcoind".to_string(),
format!("--bitcoind.rpcuser={}", rpc_user),
format!("--bitcoind.rpcpass={}", rpc_pass),
"--bitcoind.rpchost=host.containers.internal:8332".to_string(),
"--bitcoind.zmqpubrawblock=tcp://host.containers.internal:28332".to_string(),
"--bitcoind.zmqpubrawtx=tcp://host.containers.internal:28333".to_string(),
"--bitcoind.rpchost=bitcoin-knots:8332".to_string(),
"--rpclisten=0.0.0.0:10009".to_string(),
"--restlisten=0.0.0.0:8080".to_string(),
"--listen=0.0.0.0:9735".to_string(),
@@ -549,7 +566,8 @@ pub(super) async fn get_app_config(
"BTCPAY_PROTOCOL=http".to_string(),
format!("BTCPAY_HOST={}:23000", host_ip),
"BTCPAY_CHAINS=btc".to_string(),
format!("BTCPAY_BTCRPCURL=http://{}:8332", host_ip),
"BTCPAY_BTCEXPLORERURL=http://archy-nbxplorer:32838".to_string(),
"BTCPAY_BTCRPCURL=http://bitcoin-knots:8332".to_string(),
format!("BTCPAY_BTCRPCUSER={}", rpc_user),
format!("BTCPAY_BTCRPCPASSWORD={}", rpc_pass),
format!("BTCPAY_POSTGRES=User ID=btcpay;Password={};Host=archy-btcpay-db;Port=5432;Database=btcpay;Include Error Detail=true",
@@ -561,7 +579,7 @@ pub(super) async fn get_app_config(
"mempool" | "mempool-web" => (
vec!["4080:8080".to_string()],
vec![],
vec![format!("BACKEND_MAINNET_HTTP_HOST={}", host_ip)],
vec!["BACKEND_MAINNET_HTTP_HOST=mempool-api".to_string()],
None,
None,
),
@@ -570,12 +588,12 @@ pub(super) async fn get_app_config(
vec!["/var/lib/archipelago/mempool:/data".to_string()],
vec![
"MEMPOOL_BACKEND=electrum".to_string(),
"ELECTRUM_HOST=host.containers.internal".to_string(),
"ELECTRUM_HOST=electrumx".to_string(),
"ELECTRUM_PORT=50001".to_string(),
"ELECTRUM_TLS_ENABLED=false".to_string(),
format!("CORE_RPC_HOST={}", host_ip),
"CORE_RPC_HOST=bitcoin-knots".to_string(),
"CORE_RPC_PORT=8332".to_string(),
format!("CORE_RPC_USERNAME={}", rpc_user),
"CORE_RPC_USERNAME=archipelago".to_string(),
format!("CORE_RPC_PASSWORD={}", rpc_pass),
"DATABASE_ENABLED=true".to_string(),
"DATABASE_HOST=archy-mempool-db".to_string(),
@@ -592,12 +610,19 @@ pub(super) async fn get_app_config(
vec!["/var/lib/archipelago/electrumx:/data".to_string()],
vec![
format!(
"DAEMON_URL=http://{}:{}@host.containers.internal:8332/",
"DAEMON_URL=http://{}:{}@bitcoin-knots:8332/",
rpc_user, rpc_pass
),
"COIN=Bitcoin".to_string(),
"DB_DIRECTORY=/data".to_string(),
"SERVICES=tcp://:50001,rpc://0.0.0.0:8000".to_string(),
// Sync-speed: bigger LRU/write cache during initial
// history index. Default is 1200MB, container now
// gets 2g (config.rs::get_memory_limit) so 2048 fits.
"CACHE_MB=2048".to_string(),
// MAX_SEND caps the size in bytes of a single response sent to
// a client; 10000000 is about 10 MB.
"MAX_SEND=10000000".to_string(),
],
None,
None,
@@ -610,7 +635,7 @@ pub(super) async fn get_app_config(
"MYSQL_DATABASE=mempool".to_string(),
"MYSQL_USER=mempool".to_string(),
format!("MYSQL_PASSWORD={}", read_secret("mempool-db-password", "mempoolpass")),
format!("MYSQL_ROOT_PASSWORD={}", read_secret("mempool-db-root-password", "rootpass")),
format!("MYSQL_ROOT_PASSWORD={}", read_secret("mysql-root-db-password", "rootpass")),
],
None,
None,
@@ -752,14 +777,14 @@ pub(super) async fn get_app_config(
vec!["9000:9000".to_string()],
vec![
"/var/lib/archipelago/portainer:/data".to_string(),
"/var/run/podman/podman.sock:/var/run/docker.sock".to_string(),
"/run/user/1000/podman/podman.sock:/var/run/docker.sock".to_string(),
],
vec![],
None,
None,
),
"uptime-kuma" => (
vec!["3001:3001".to_string()],
vec!["3002:3001".to_string()],
vec!["/var/lib/archipelago/uptime-kuma:/app/data".to_string()],
vec!["TZ=UTC".to_string()],
None,
@@ -769,10 +794,17 @@ pub(super) async fn get_app_config(
vec!["8240:8240".to_string()],
vec!["/var/lib/archipelago/tailscale:/var/lib/tailscale".to_string()],
vec!["TS_STATE_DIR=/var/lib/tailscale".to_string()],
Some(
"sh -c 'tailscale web --listen 0.0.0.0:8240 & exec tailscaled'".to_string(),
),
// Don't use custom_command (Option<String>) — install.rs passes
// it as a SINGLE arg to podman, which then treats the whole
// "sh -c 'tailscale web …'" string as the executable name and
// fails: "executable file `sh -c 'tailscale web …'` not found".
// custom_args (Option<Vec<String>>) splits properly.
None,
Some(vec![
"sh".to_string(),
"-c".to_string(),
"tailscale web --listen 0.0.0.0:8240 & exec tailscaled".to_string(),
]),
),
"fedimint" => (
vec![
@@ -791,13 +823,13 @@ pub(super) async fn get_app_config(
"FM_BIND_UI=0.0.0.0:8175".to_string(),
format!("FM_P2P_URL=fedimint://{}:8173", host_ip),
format!("FM_API_URL=ws://{}:8174", host_ip),
format!("FM_BITCOIND_URL=http://{}:8332", host_ip),
"FM_BITCOIND_URL=http://bitcoin-knots:8332".to_string(),
],
None,
Some(vec![
"--data-dir".to_string(),
"/data".to_string(),
format!("--bitcoind-url=http://{}:{}@{}:8332", rpc_user, rpc_pass, host_ip),
format!("--bitcoind-url=http://{}:{}@bitcoin-knots:8332", rpc_user, rpc_pass),
]),
),
"fedimint-gateway" => {
@@ -821,7 +853,7 @@ pub(super) async fn get_app_config(
"--network".to_string(),
"bitcoin".to_string(),
"--bitcoind-url".to_string(),
format!("http://{}:8332", host_ip),
"http://bitcoin-knots:8332".to_string(),
"--bitcoind-username".to_string(),
rpc_user.to_string(),
"--bitcoind-password".to_string(),

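The Tailscale entry in `get_app_config` switches from `custom_command` to `custom_args` because a single string is forwarded as one argv element. A minimal sketch of that difference follows; `trailing_argv` is a hypothetical stand-in for the argv assembly in install.rs, not the actual code.

```rust
/// Hypothetical helper (not from the source): builds the arguments that
/// would follow the image name in `podman run IMAGE ...`.
fn trailing_argv(custom_command: Option<&str>, custom_args: Option<&[&str]>) -> Vec<String> {
    let mut argv = Vec::new();
    // A custom command is forwarded verbatim as ONE argv element, so a
    // shell-style string is never split on whitespace.
    if let Some(cmd) = custom_command {
        argv.push(cmd.to_string());
    }
    // Custom args are forwarded element by element.
    if let Some(args) = custom_args {
        argv.extend(args.iter().map(|a| a.to_string()));
    }
    argv
}
```

With `custom_command`, the container runtime receives one element and looks for an executable literally named `sh -c '...'`. With `custom_args`, it receives `sh`, `-c`, and the script as three separate elements.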

@@ -9,14 +9,15 @@ use super::dependencies::{
use super::progress::parse_pull_progress;
use super::validation::validate_app_id;
use crate::api::rpc::RpcHandler;
use crate::data_model::InstallPhase;
use anyhow::{Context, Result};
use tokio::io::{AsyncBufReadExt, BufReader};
use tracing::{debug, info, warn};
const INSTALL_LOG: &str = "/var/log/archipelago-container-installs.log";
const INSTALL_LOG: &str = "/var/log/archipelago/container-installs.log";
/// Append a timestamped line to the persistent install log.
pub(super) async fn install_log(msg: &str) {
pub(in crate::api::rpc) async fn install_log(msg: &str) {
use tokio::io::AsyncWriteExt;
let ts = chrono::Utc::now().format("%Y-%m-%d %H:%M:%S UTC");
let line = format!("[{}] {}\n", ts, msg);
@@ -30,6 +31,124 @@ pub(super) async fn install_log(msg: &str) {
}
}
/// Patch the Bitcoin RPC `Authorization: Basic ...` header inside the running
/// bitcoin-ui container's nginx config and reload nginx. Authoritative
/// credential injection — runs whether the image was built locally or pulled
/// from the registry. Without this, registry images ship with whatever auth
/// header was baked at build time on the publisher's machine, which never
/// matches the per-node randomly-generated bitcoin-rpc-password.
///
/// Implementation note: this used to do `podman exec sed`, but rootless
/// podman + tightly-confined containers (--cap-drop=ALL, restricted user)
/// reject the exec because crun can't add a new process to the container's
/// cgroup ("write cgroup.procs: Permission denied"). Switched to
/// `podman cp` (storage layer, no cgroup join) + `podman kill --signal=SIGHUP`
/// (signal to existing PID 1, no new process needed). Verified on .228.
async fn inject_bitcoin_rpc_auth_into_running_container(container: &str, auth_b64: &str) {
use rand::distributions::{Alphanumeric, DistString};
let token = Alphanumeric.sample_string(&mut rand::thread_rng(), 8);
let host_path = format!("/tmp/archy-{container}-nginx.conf-{token}");
let in_container = "/etc/nginx/conf.d/default.conf";
// 1. Copy the running config out to host
let cp_out = tokio::process::Command::new("podman")
.args(["cp", &format!("{container}:{in_container}"), &host_path])
.output()
.await;
if let Err(e) = cp_out {
warn!("inject auth: podman cp out failed for {}: {}", container, e);
return;
}
if let Ok(ref o) = cp_out {
if !o.status.success() {
warn!(
"inject auth: podman cp out failed for {}: {}",
container,
String::from_utf8_lossy(&o.stderr)
);
return;
}
}
// 2. Patch the auth line on disk
let content = match tokio::fs::read_to_string(&host_path).await {
Ok(c) => c,
Err(e) => {
warn!("inject auth: read {} failed: {}", host_path, e);
let _ = tokio::fs::remove_file(&host_path).await;
return;
}
};
let mut patched_any = false;
let updated: String = content
.lines()
.map(|line| {
if line.contains("proxy_set_header Authorization") && line.contains("Basic") {
patched_any = true;
format!(
" proxy_set_header Authorization \"Basic {}\";",
auth_b64
)
} else {
line.to_string()
}
})
.collect::<Vec<_>>()
.join("\n");
if !patched_any {
warn!(
"inject auth: no Authorization line matched in {}'s nginx.conf",
container
);
let _ = tokio::fs::remove_file(&host_path).await;
return;
}
if let Err(e) = tokio::fs::write(&host_path, format!("{}\n", updated)).await {
warn!("inject auth: write back failed: {}", e);
let _ = tokio::fs::remove_file(&host_path).await;
return;
}
// 3. Copy patched config back into the container
let cp_in = tokio::process::Command::new("podman")
.args(["cp", &host_path, &format!("{container}:{in_container}")])
.output()
.await;
let _ = tokio::fs::remove_file(&host_path).await;
match cp_in {
Ok(o) if !o.status.success() => {
warn!(
"inject auth: podman cp in failed for {}: {}",
container,
String::from_utf8_lossy(&o.stderr)
);
return;
}
Err(e) => {
warn!("inject auth: podman cp in errored for {}: {}", container, e);
return;
}
_ => {}
}
// 4. Reload nginx via SIGHUP to PID 1 (no exec/cgroup join needed)
let reload = tokio::process::Command::new("podman")
.args(["kill", "--signal=SIGHUP", container])
.output()
.await;
match reload {
Ok(o) if o.status.success() => {
info!("Injected Bitcoin RPC auth into {} (post-start, cp+SIGHUP)", container);
}
Ok(o) => warn!(
"Patched nginx.conf in {} but SIGHUP failed: {}",
container,
String::from_utf8_lossy(&o.stderr)
),
Err(e) => warn!("Patched nginx.conf in {} but SIGHUP errored: {}", container, e),
}
}
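The line-patching step in `inject_bitcoin_rpc_auth_into_running_container` can be isolated as a pure function, which makes it testable without podman. This is a sketch only: `patch_basic_auth` is a hypothetical name, and the eight-space indent of the rewritten line is an assumption.

```rust
/// Hypothetical extraction of the patch step. Returns the patched config,
/// or `None` when no `proxy_set_header Authorization ... Basic` line matched.
fn patch_basic_auth(conf: &str, auth_b64: &str) -> Option<String> {
    let mut patched_any = false;
    let lines: Vec<String> = conf
        .lines()
        .map(|line| {
            if line.contains("proxy_set_header Authorization") && line.contains("Basic") {
                patched_any = true;
                // Indentation of the rewritten line is assumed, not taken from the source.
                format!("        proxy_set_header Authorization \"Basic {}\";", auth_b64)
            } else {
                line.to_string()
            }
        })
        .collect();
    if patched_any {
        // Re-join with newlines and restore the trailing newline.
        Some(format!("{}\n", lines.join("\n")))
    } else {
        None
    }
}
```

Returning `None` mirrors the early return taken when `patched_any` stays false, so a caller can skip the copy-back and SIGHUP entirely.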
impl RpcHandler {
/// Install a package from a Docker image.
/// Security: Image verification, resource limits, network isolation.
@@ -82,6 +201,16 @@ impl RpcHandler {
}
}
// Phase: Preparing — emit BEFORE the stack dispatch so multi-container
// stacks also flip state to Installing immediately. Without this, the
// backend's package state for stack apps stayed empty until the first
// podman pull finished, so a hard refresh during the early seconds of
// a stack install showed the app as missing entirely (the user
// reported "the app disappears from installing if you hard refresh
// then sometimes comes back later").
self.set_install_phase(package_id, InstallPhase::Preparing)
.await;
// Multi-container stacks get their own install path
if package_id == "immich" {
return self.install_immich_stack().await;
@@ -175,18 +304,79 @@ impl RpcHandler {
}));
}
// Preferred path for apps already modeled in the production orchestrator.
// Keep legacy install flow as default while migration is in progress.
if should_try_orchestrator_install(package_id, self.orchestrator.is_some()) {
let orchestrator_app_id = orchestrator_install_app_id(package_id);
self.set_install_phase(package_id, InstallPhase::CreatingContainer)
.await;
install_log(&format!(
"INSTALL ORCH: {} — attempting orchestrator install as {}",
package_id, orchestrator_app_id
))
.await;
if let Some(orchestrator) = self.orchestrator.as_ref() {
match orchestrator.install(orchestrator_app_id).await {
Ok(container_name) => {
self.set_install_phase(package_id, InstallPhase::WaitingHealthy)
.await;
install_log(&format!(
"INSTALL ORCH OK: {} (app={}) — container={}",
package_id, orchestrator_app_id, container_name
))
.await;
return Ok(serde_json::json!({
"success": true,
"package_id": package_id,
"container_name": container_name,
"message": format!("Package {} installed and started", package_id)
}));
}
Err(e) if is_unknown_app_id_error(&e) => {
info!(
"Install {}: orchestrator has no manifest mapping yet, falling back to legacy installer",
package_id
);
install_log(&format!(
"INSTALL ORCH SKIP: {} — unknown app_id, using legacy flow",
package_id
))
.await;
}
Err(e) => {
install_log(&format!("INSTALL ORCH FAIL: {}: {}", package_id, e)).await;
return Err(
e.context(format!("Orchestrator install {} failed", package_id))
);
}
}
}
}
// Pull or verify image
install_log(&format!(
"INSTALL PULL: {} — pulling image {}",
package_id, docker_image
))
.await;
// Phase: PullingImage — the longest phase. Podman doesn't emit
// parseable progress on a piped stderr, so the UI shows an
// indeterminate "Downloading image…" at this fixed percentage
// until pull completes.
self.set_install_phase(package_id, InstallPhase::PullingImage)
.await;
let has_local_fallback = self.pull_or_verify_image(package_id, docker_image).await?;
install_log(&format!(
"INSTALL PULL OK: {} — image ready (local_fallback={})",
package_id, has_local_fallback
))
.await;
// Phase: CreatingContainer — image is local, now writing configs,
// data directories, chowning to container UID, building the run
// argv. Fast (sub-second to a few seconds).
self.set_install_phase(package_id, InstallPhase::CreatingContainer)
.await;
// Normalize container name for legacy aliases
let container_name = match package_id {
@@ -377,7 +567,26 @@ impl RpcHandler {
let memory_limit = get_memory_limit(package_id);
let mem_arg = format!("--memory={}", memory_limit);
run_args.push(&mem_arg);
run_args.push("--cpus=2");
// Bitcoin (and friends) need every core they can get during initial
// blockchain download — script verification is parallelizable and
// the limiting factor on most home boxes. --cpus=2 was halving sync
// speed for 4-8 core machines. ElectrumX likewise scales with cores
// during its initial indexing phase.
let cpu_capped = !matches!(
package_id,
"bitcoin" | "bitcoin-core" | "bitcoin-knots" | "electrumx" | "electrs" | "mempool-electrs"
);
if cpu_capped {
run_args.push("--cpus=2");
}
// Uptime Kuma image entrypoint (`extra/entrypoint.sh`) attempts
// `setpriv --clear-groups` and fails under our rootless + cap-drop
// defaults. Run the server directly via dumb-init to keep startup
// stable on production nodes.
if package_id == "uptime-kuma" {
run_args.push("--entrypoint=/usr/bin/dumb-init");
}
// Health checks
let health_args = get_health_check_args(package_id, &rpc_pass);
@@ -436,9 +645,21 @@ impl RpcHandler {
))
.await;
// Phase: StartingContainer — podman run accepted. Next we poll
// inspect until State.Status == running (up to 60s).
self.set_install_phase(package_id, InstallPhase::StartingContainer)
.await;
// Post-start health verification: wait up to 60s for container to be running
let mut container_running = false;
for i in 0..12u32 {
// After the first poll, flip the UI to WaitingHealthy — the
// container hasn't come up yet, so the phase label changes
// from "Starting container" to "Waiting for healthy".
if i == 1 {
self.set_install_phase(package_id, InstallPhase::WaitingHealthy)
.await;
}
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
let status = tokio::process::Command::new("podman")
.args(["inspect", container_name, "--format", "{{.State.Status}}"])
@@ -498,6 +719,12 @@ impl RpcHandler {
));
}
// Phase: PostInstall — container is up and running. Now any
// app-specific post-install (chain init, wallet setup, waiting
// for a first block). Varies by app; some are no-ops.
self.set_install_phase(package_id, InstallPhase::PostInstall)
.await;
// Post-install hooks — await completion before returning success
self.run_post_install_hooks(package_id).await;
@@ -770,20 +997,30 @@ impl RpcHandler {
/// Create data directories for volume mounts under /var/lib/archipelago/.
/// Get the mapped host UID for a container's internal UID.
/// Rootless podman maps container UIDs: host_uid = subuid_start + container_uid
/// Default subuid start for archipelago user is 100000.
/// Rootless podman UID maps commonly look like:
/// container 0 -> host real uid (e.g. 1000)
/// container 1.. -> host subuid range starting at 100000
/// So for uid>=1, host_uid = 99999 + container_uid.
fn mapped_uid(package_id: &str) -> u32 {
let container_uid = match package_id {
"bitcoin-knots" | "bitcoin" | "bitcoin-core" => 101,
"grafana" => 472,
"lnd" => 1000,
"mariadb" | "mysql" | "mysql-mempool" | "archy-mempool-db" => 999,
"postgres" | "btcpay-postgres" | "immich-postgres"
| "archy-btcpay-db" | "nextcloud-db" => 70,
"postgres" | "immich-postgres" | "nextcloud-db" => 70,
// Current BTCPay Postgres image runs as uid 999 inside the
// container, so its rootless host-mapped uid is 100998.
"btcpay-postgres" | "archy-btcpay-db" => 999,
"electrumx" | "electrs" => 1000,
_ => 0, // Most containers run as root (UID 0)
};
100000 + container_uid
if container_uid == 0 {
// Archipelago daemon runs as rootless user (typically uid 1000).
// Container uid 0 maps to that real host uid.
1000
} else {
99999 + container_uid
}
}
async fn create_data_dirs(&self, package_id: &str, volumes: &[String]) {
@@ -796,36 +1033,44 @@ impl RpcHandler {
debug!("Creating directory: {} (owner: {})", host_path, uid_str);
// Create directory directly (service has ReadWritePaths access).
// sudo is blocked by NoNewPrivileges=yes in the systemd service.
if let Err(e) = std::fs::create_dir_all(host_path) {
tracing::warn!("Failed to create directory {}: {}", host_path, e);
}
// Set ownership to the mapped UID for rootless podman.
// Try sudo chown first (works on LUKS), fall back to podman unshare.
// Try sudo chown first, then fall back to podman unshare
// for subuid-mapped UIDs only.
let host_uid = format!("{}:{}", uid, uid);
let sudo_result = tokio::process::Command::new("sudo")
.args(["chown", "-R", &host_uid, host_path])
.output()
.await;
let sudo_ok = sudo_result.as_ref().is_ok_and(|o| o.status.success());
if !sudo_ok {
// Fallback: podman unshare (works on non-LUKS ext4)
let container_uid = uid - 100000;
let container_uid_str = format!("{}:{}", container_uid, container_uid);
let chown_result = tokio::process::Command::new("podman")
.args(["unshare", "chown", "-R", &container_uid_str, host_path])
.output()
.await;
match chown_result {
Ok(out) if !out.status.success() => {
tracing::warn!(
"chown failed for {} (both sudo and podman unshare)",
host_path,
);
if uid >= 100000 {
// Inverse of mapped_uid: host_uid = 99999 + container_uid.
let container_uid = uid - 99999;
let container_uid_str = format!("{}:{}", container_uid, container_uid);
let chown_result = tokio::process::Command::new("podman")
.args(["unshare", "chown", "-R", &container_uid_str, host_path])
.output()
.await;
match chown_result {
Ok(out) if !out.status.success() => {
tracing::warn!(
"chown failed for {} (both sudo and podman unshare)",
host_path,
);
}
Err(e) => tracing::warn!("Failed to chown {}: {}", host_path, e),
_ => {}
}
Err(e) => tracing::warn!("Failed to chown {}: {}", host_path, e),
_ => {}
} else {
tracing::warn!(
"chown fallback skipped for {}: host uid {} has no subuid mapping",
host_path,
uid
);
}
}
}
@@ -1193,54 +1438,18 @@ autopilot.active=false\n",
}
}
// Gitea: deploy nginx proxy on port 3000 to strip X-Frame-Options for iframe embedding.
// Gitea container runs on 3001, nginx proxies 3000->3001 removing the header.
// Gitea: keep it on its native host port (3001) and serve it under
// /app/gitea/ via the main Archipelago nginx config. Avoids colliding
// with Grafana, which also uses host port 3000.
if package_id == "gitea" {
let nginx_conf = r#"# Gitea iframe proxy — strips X-Frame-Options for Archipelago iframe
server {
listen 3000;
server_name _;
client_max_body_size 1G;
location / {
proxy_pass http://127.0.0.1:3001;
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_hide_header X-Frame-Options;
proxy_hide_header Content-Security-Policy;
}
}
"#;
let conf_path = "/etc/nginx/conf.d/gitea-iframe.conf";
if let Err(e) = tokio::fs::write(conf_path, nginx_conf).await {
tracing::warn!("Failed to write gitea nginx conf: {}", e);
} else {
let reload = tokio::process::Command::new("nginx")
.args(["-s", "reload"])
.output()
.await;
match reload {
Ok(o) if o.status.success() => {
info!("Gitea: nginx iframe proxy deployed on port 3000");
}
Ok(o) => tracing::warn!(
"Gitea nginx reload failed: {}",
String::from_utf8_lossy(&o.stderr)
),
Err(e) => tracing::warn!("Gitea nginx reload error: {}", e),
}
}
let _ = tokio::fs::remove_file("/etc/nginx/conf.d/gitea-iframe.conf").await;
// Set ROOT_URL in Gitea config — port 3000 is the nginx iframe proxy,
// which is the public-facing port users and the UI iframe access.
// Set ROOT_URL to the UI path-based route so links/assets stay
// anchored under Archipelago's app proxy endpoint.
let host_ip = &self.config.host_ip;
let _ = tokio::process::Command::new("podman")
.args(["exec", "gitea", "sh", "-c",
&format!("grep -q ROOT_URL /data/gitea/conf/app.ini && sed -i 's|ROOT_URL.*|ROOT_URL = http://{}:3000/|' /data/gitea/conf/app.ini || true", host_ip)])
&format!("grep -q ROOT_URL /data/gitea/conf/app.ini && sed -i 's|ROOT_URL.*|ROOT_URL = http://{}/app/gitea/|' /data/gitea/conf/app.ini || true", host_ip)])
.output()
.await;
// Also ensure X_FRAME_OPTIONS is empty so Gitea doesn't send the header
@@ -1249,8 +1458,15 @@ server {
"grep -q X_FRAME_OPTIONS /data/gitea/conf/app.ini && sed -i 's|X_FRAME_OPTIONS.*|X_FRAME_OPTIONS =|' /data/gitea/conf/app.ini || sed -i '/^\\[security\\]/a X_FRAME_OPTIONS =' /data/gitea/conf/app.ini"])
.output()
.await;
// Reload main nginx so /app/gitea/ routing changes take effect.
let _ = tokio::process::Command::new("nginx")
.args(["-s", "reload"])
.output()
.await;
info!(
"Gitea: ROOT_URL set to http://{}:3000/, X_FRAME_OPTIONS cleared",
"Gitea: ROOT_URL set to http://{}/app/gitea/, X_FRAME_OPTIONS cleared",
host_ip
);
}
@@ -1285,41 +1501,66 @@ server {
info!("Nextcloud trusted domains configured for {}", host_ip);
}
// Pre-build: inject Bitcoin RPC auth into bitcoin-ui nginx.conf
if matches!(package_id, "bitcoin" | "bitcoin-core" | "bitcoin-knots") {
let (rpc_user, rpc_pass) = crate::bitcoin_rpc::bitcoin_rpc_credentials().await;
use base64::Engine;
let auth_b64 = base64::engine::general_purpose::STANDARD
.encode(format!("{}:{}", rpc_user, rpc_pass));
for dir in [
"/opt/archipelago/docker/bitcoin-ui",
"/home/archipelago/archy/docker/bitcoin-ui",
] {
let conf_path = format!("{}/nginx.conf", dir);
if let Ok(content) = tokio::fs::read_to_string(&conf_path).await {
// Replace placeholder or previously-injected auth (regex: Basic followed by base64 or placeholder)
let updated = content
.replace("__BITCOIN_RPC_AUTH__", &auth_b64)
.lines()
.map(|line| {
if line.contains("proxy_set_header Authorization")
&& line.contains("Basic")
// Inject Bitcoin RPC auth into bitcoin-ui nginx.conf.
// Two paths because the credential is per-node and randomly generated
// at first boot, so it can't be baked into the published registry image:
// 1. Build-time: rewrite nginx.conf on disk before `podman build`.
// Only fires when /opt/archipelago/docker/bitcoin-ui exists (dev
// box or ISO that shipped the docker tree). Skipped silently in
// production where ui_builds falls through to the registry image.
// 2. Post-start: `podman cp` nginx.conf out of the running container,
// patch it, copy it back, and SIGHUP nginx to reload. Authoritative
// for both paths, since it runs regardless of how the image was built.
let bitcoin_rpc_auth_b64: Option<String> =
if matches!(package_id, "bitcoin" | "bitcoin-core" | "bitcoin-knots") {
let (rpc_user, rpc_pass) = crate::bitcoin_rpc::bitcoin_rpc_credentials().await;
use base64::Engine;
let auth_b64 = base64::engine::general_purpose::STANDARD
.encode(format!("{}:{}", rpc_user, rpc_pass));
for dir in [
"/opt/archipelago/docker/bitcoin-ui",
"/home/archipelago/archy/docker/bitcoin-ui",
] {
let conf_path = format!("{}/nginx.conf", dir);
match tokio::fs::read_to_string(&conf_path).await {
Ok(content) => {
let updated = content
.replace("__BITCOIN_RPC_AUTH__", &auth_b64)
.lines()
.map(|line| {
if line.contains("proxy_set_header Authorization")
&& line.contains("Basic")
{
format!(
" proxy_set_header Authorization \"Basic {}\";",
auth_b64
)
} else {
line.to_string()
}
})
.collect::<Vec<_>>()
.join("\n");
if let Err(e) =
tokio::fs::write(&conf_path, format!("{}\n", updated)).await
{
format!(
" proxy_set_header Authorization \"Basic {}\";",
auth_b64
)
warn!("Failed to write {} with injected RPC auth: {}", conf_path, e);
} else {
line.to_string()
info!("Injected Bitcoin RPC auth into {} (build-time)", conf_path);
}
})
.collect::<Vec<_>>()
.join("\n");
let _ = tokio::fs::write(&conf_path, format!("{}\n", updated)).await;
info!("Injected Bitcoin RPC auth into {}", conf_path);
}
Err(_) => {
debug!(
"No build-time nginx.conf at {} (will patch running container after start)",
conf_path
);
}
}
}
}
}
Some(auth_b64)
} else {
None
};
// Build and start companion UI containers for headless services.
// All UIs proxy to localhost (backend :5678 or bitcoin :8332) so they need --network=host.
@@ -1356,9 +1597,14 @@ server {
.find(|d| std::path::Path::new(d).join("Dockerfile").exists())
.unwrap_or_else(|| ui_dir.to_string());
let image_base = image_base.to_string();
let registry = "git.tx1138.com/lfg2025";
let registry = "146.59.87.168:3000/lfg2025";
let registry_image = format!("{}/{}:latest", registry, image_base);
let local_image = format!("localhost/{}:latest", image_base);
let post_start_auth = if name == "archy-bitcoin-ui" {
bitcoin_rpc_auth_b64.clone()
} else {
None
};
tokio::spawn(async move {
// Remove existing container
let _ = tokio::process::Command::new("podman")
@@ -1406,32 +1652,69 @@ server {
}
};
// For bitcoin-ui specifically: render nginx.conf to host BEFORE
// starting the container, then bind-mount it. This is the durable
// fix for the bitcoin-rpc 401 — the per-node password is in the
// file before nginx ever opens it. Survives container recreate,
// image update, reboot, --restart=unless-stopped cycles, and
// doesn't need any post-start patching that could fail under
// tightly-confined cgroup permissions.
let mut bitcoin_ui_mount: Option<String> = None;
if name == "archy-bitcoin-ui" {
let paths = crate::container::bitcoin_ui::RenderPaths::default();
match crate::container::bitcoin_ui::render(&paths).await {
Ok(outcome) => {
bitcoin_ui_mount = Some(format!(
"{}:/etc/nginx/conf.d/default.conf:ro,Z",
paths.rendered_path.display()
));
info!(
"bitcoin-ui nginx.conf rendered ({:?}) — will bind-mount at startup",
outcome
);
}
Err(e) => warn!(
"Failed to render bitcoin-ui nginx.conf: {} — \
will fall back to post-start patch (less reliable)",
e
),
}
}
// Run with --network=host (UIs proxy to localhost backend/bitcoin)
// --user 0:0: run as root inside container (still unprivileged on host
// in rootless podman) to avoid nginx chown failures
let mut args: Vec<String> = vec![
"run".into(),
"-d".into(),
"--name".into(),
name.clone(),
"--restart=unless-stopped".into(),
"--network=host".into(),
"--user=0:0".into(),
"--cap-drop=ALL".into(),
"--cap-add=CHOWN".into(),
"--cap-add=DAC_OVERRIDE".into(),
"--cap-add=NET_BIND_SERVICE".into(),
"--cap-add=SETUID".into(),
"--cap-add=SETGID".into(),
"--memory=128m".into(),
];
if let Some(ref mount) = bitcoin_ui_mount {
args.push("-v".into());
args.push(mount.clone());
}
args.push(image.clone());
let run = tokio::process::Command::new("podman")
.args([
"run",
"-d",
"--name",
&name,
"--restart=unless-stopped",
"--network=host",
"--user=0:0",
"--cap-drop=ALL",
"--cap-add=CHOWN",
"--cap-add=DAC_OVERRIDE",
"--cap-add=NET_BIND_SERVICE",
"--cap-add=SETUID",
"--cap-add=SETGID",
"--memory=128m",
&image,
])
.args(&args)
.output()
.await;
match run {
Ok(o) if o.status.success() => {
info!("{} UI container started (host network)", name)
info!("{} UI container started (host network)", name);
if let Some(ref auth) = post_start_auth {
inject_bitcoin_rpc_auth_into_running_container(&name, auth).await;
}
}
Ok(o) => warn!(
"Failed to start {}: {}",
@@ -1537,9 +1820,15 @@ server {
// Reassign priorities: target = 0, everyone else = 10, 20, 30…
// in their existing priority order.
let target_url = url.to_string();
config.registries.sort_by_key(|r| (r.url != target_url, r.priority));
config
.registries
.sort_by_key(|r| (r.url != target_url, r.priority));
for (i, r) in config.registries.iter_mut().enumerate() {
r.priority = if r.url == target_url { 0 } else { (i as u32) * 10 };
r.priority = if r.url == target_url {
0
} else {
(i as u32) * 10
};
}
crate::container::registry::save_registries(&self.config.data_dir, &config).await?;
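The priority reassignment above relies on `false` sorting before `true`, so the target registry moves to the front and the rest keep their relative order. A self-contained sketch, assuming a simplified `Registry` struct with only the two fields the hunk uses (the real type likely has more) and a hypothetical function name `promote`:

```rust
/// Simplified stand-in for the registry config entry (assumed shape).
#[derive(Debug, Clone, PartialEq)]
struct Registry {
    url: String,
    priority: u32,
}

/// Move `target_url` to priority 0 and renumber the others 10, 20, 30...
fn promote(registries: &mut [Registry], target_url: &str) {
    // Key is (is_not_target, old_priority): the target sorts first, and
    // everyone else stays in existing priority order.
    registries.sort_by_key(|r| (r.url != target_url, r.priority));
    for (i, r) in registries.iter_mut().enumerate() {
        r.priority = if r.url == target_url { 0 } else { (i as u32) * 10 };
    }
}
```

Because the target always lands at index 0, the remaining entries receive 10, 20, 30 and so on without gaps, provided the target URL is actually present in the list.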
@@ -1561,7 +1850,7 @@ server {
.unwrap_or(true);
// Registries are configured as `host[:port]/namespace` (for
// example `23.182.128.160:3000/lfg2025`), but the Docker V2
// example `146.59.87.168:3000/lfg2025`), but the Docker V2
// registry API lives at `/v2/` on the ROOT of the host — NOT
// under the namespace. Strip the namespace before appending
// `/v2/` so the reachability probe hits the correct URL.
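A sketch of the namespace stripping that the comment above describes. The function name `v2_probe_url` and the `tls` flag are assumptions for illustration; the real probe code is not shown in this hunk.

```rust
/// Hypothetical helper: turn `host[:port]/namespace` into the registry
/// root probe URL `scheme://host[:port]/v2/`.
fn v2_probe_url(registry: &str, tls: bool) -> String {
    // Keep only `host[:port]`, i.e. everything before the first '/'.
    let host = registry.split('/').next().unwrap_or(registry);
    let scheme = if tls { "https" } else { "http" };
    format!("{}://{}/v2/", scheme, host)
}
```

For the example in the comment, `v2_probe_url("146.59.87.168:3000/lfg2025", false)` yields a URL at the host root rather than under `/lfg2025`.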
@@ -1667,3 +1956,102 @@ async fn resolve_host_gateway() -> String {
// Last resort
"--add-host=host.containers.internal:10.0.2.2".to_string()
}
fn should_try_orchestrator_install(package_id: &str, orchestrator_available: bool) -> bool {
orchestrator_available && uses_orchestrator_install_flow(package_id)
}
fn orchestrator_install_app_id(package_id: &str) -> &str {
match package_id {
"bitcoin-knots" => "bitcoin-core",
"electrs" | "mempool-electrs" => "electrumx",
_ => package_id,
}
}
fn uses_orchestrator_install_flow(package_id: &str) -> bool {
matches!(
package_id,
// Step 7 UI apps
"bitcoin-ui"
| "electrs-ui"
| "lnd-ui"
// Step 8b backend ports
| "bitcoin-core"
| "bitcoin-knots"
| "lnd"
| "fedimint"
| "fedimint-gateway"
| "filebrowser"
| "electrumx"
| "electrs"
| "mempool-electrs"
| "archy-mempool-db"
| "mempool-api"
| "archy-mempool-web"
| "archy-btcpay-db"
| "archy-nbxplorer"
| "btcpay-server"
)
}
fn is_unknown_app_id_error(err: &anyhow::Error) -> bool {
err.chain()
.any(|cause| cause.to_string().contains("unknown app_id"))
}
#[cfg(test)]
mod tests {
use super::{
orchestrator_install_app_id, should_try_orchestrator_install,
uses_orchestrator_install_flow,
};
#[test]
fn orchestrator_install_allowlist_includes_ported_backends() {
for app in [
"bitcoin-ui",
"electrs-ui",
"lnd-ui",
"bitcoin-core",
"bitcoin-knots",
"lnd",
"fedimint",
"fedimint-gateway",
"filebrowser",
"electrumx",
"electrs",
"mempool-electrs",
"archy-mempool-db",
"mempool-api",
"archy-mempool-web",
"archy-btcpay-db",
"archy-nbxplorer",
"btcpay-server",
] {
assert!(uses_orchestrator_install_flow(app));
assert!(should_try_orchestrator_install(app, true));
}
}
#[test]
fn non_allowlisted_apps_stay_legacy_install() {
for app in ["searxng", "mempool", "indeedhub", "immich", "penpot"] {
assert!(!uses_orchestrator_install_flow(app));
assert!(!should_try_orchestrator_install(app, true));
}
}
#[test]
fn missing_orchestrator_disables_orchestrator_install() {
assert!(!should_try_orchestrator_install("bitcoin-ui", false));
}
#[test]
fn install_aliases_map_to_manifest_app_ids() {
assert_eq!(orchestrator_install_app_id("bitcoin-knots"), "bitcoin-core");
assert_eq!(orchestrator_install_app_id("electrs"), "electrumx");
assert_eq!(orchestrator_install_app_id("mempool-electrs"), "electrumx");
assert_eq!(orchestrator_install_app_id("lnd"), "lnd");
}
}
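The rootless UID mapping documented on `mapped_uid` (container uid 0 maps to the real host uid, container uid n >= 1 maps to 99999 + n) can be sketched together with its inverse, which is what a `podman unshare chown` fallback needs. The constants below restate the defaults named in the comments (host uid 1000, subuid range starting at 100000) and are assumptions about the host; the function names are hypothetical.

```rust
/// Assumed real uid of the rootless user running the daemon.
const HOST_UID: u32 = 1000;
/// Assumed first subordinate uid allocated to that user.
const SUBUID_START: u32 = 100_000;

/// Host uid that a container uid appears as under rootless podman.
fn host_uid_for(container_uid: u32) -> u32 {
    if container_uid == 0 {
        HOST_UID
    } else {
        // Container uid 1 maps to the first subuid, so the offset is n - 1.
        SUBUID_START + container_uid - 1
    }
}

/// Inverse mapping; `None` when the host uid is outside both ranges.
fn container_uid_for(host_uid: u32) -> Option<u32> {
    if host_uid == HOST_UID {
        Some(0)
    } else if host_uid >= SUBUID_START {
        Some(host_uid - SUBUID_START + 1)
    } else {
        None
    }
}
```

Keeping the forward and inverse mappings side by side makes the off-by-one explicit: container uid 999 corresponds to host uid 100998, and converting back must add the 1 again.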


@@ -1,3 +1,4 @@
mod async_lifecycle;
mod config;
mod dependencies;
mod install;
@@ -8,5 +9,6 @@ mod stacks;
mod update;
mod validation;
// Re-export items needed by sibling modules (container.rs, security.rs)
// Re-export items needed by sibling modules (container.rs, security.rs, transitional.rs)
pub(in crate::api::rpc) use install::install_log;
pub(super) use validation::validate_app_id;


@@ -2,12 +2,17 @@
use crate::api::rpc::RpcHandler;
use crate::data_model::{
Description, InstallProgress, Manifest, PackageDataEntry, PackageState, StaticFiles,
Description, InstallPhase, InstallProgress, Manifest, PackageDataEntry, PackageState,
StaticFiles,
};
impl RpcHandler {
/// Set install progress for a package and broadcast the update.
/// Creates a minimal package entry if one doesn't exist yet.
///
/// Prefer `set_install_phase` — this byte-counter API is kept for
/// the rare case where the pull stream actually parses, but podman
/// almost never emits parseable progress on a piped stderr.
pub(super) async fn set_install_progress(&self, package_id: &str, downloaded: u64, size: u64) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let entry = data
@@ -15,7 +20,44 @@ impl RpcHandler {
.entry(package_id.to_string())
.or_insert_with(|| create_installing_entry(package_id));
entry.state = PackageState::Installing;
entry.install_progress = Some(InstallProgress { size, downloaded });
let existing_phase = entry.install_progress.as_ref().and_then(|p| p.phase);
entry.install_progress = Some(InstallProgress {
size,
downloaded,
phase: existing_phase,
message: None,
});
self.state_manager.update_data(data).await;
}
/// Set the install pipeline phase and broadcast. This is the
/// primary progress signal — the UI maps each phase to a
/// percentage and a user-facing label. Byte counters are retained
/// for the rare case podman emits parseable progress.
pub(super) async fn set_install_phase(&self, package_id: &str, phase: InstallPhase) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let entry = data
.package_data
.entry(package_id.to_string())
.or_insert_with(|| create_installing_entry(package_id));
// Preparing / PullingImage / CreatingContainer / StartingContainer /
// WaitingHealthy / PostInstall all map to the Installing state.
// Updates use Updating state — the wrapper has already flipped
// state to Updating, so don't clobber it.
if entry.state != PackageState::Updating {
entry.state = PackageState::Installing;
}
let (size, downloaded) = entry
.install_progress
.as_ref()
.map(|p| (p.size, p.downloaded))
.unwrap_or((0, 0));
entry.install_progress = Some(InstallProgress {
size,
downloaded,
phase: Some(phase),
message: None,
});
self.state_manager.update_data(data).await;
}
@@ -52,9 +94,12 @@ impl RpcHandler {
.package_data
.entry(package_id.to_string())
.or_insert_with(|| create_installing_entry(package_id));
let existing_phase = entry.install_progress.as_ref().and_then(|p| p.phase);
entry.install_progress = Some(InstallProgress {
size: total,
downloaded,
phase: existing_phase,
message: None,
});
state_manager.update_data(data).await;
}
@@ -69,7 +114,11 @@ fn create_installing_entry(package_id: &str) -> PackageDataEntry {
static_files: StaticFiles {
license: String::new(),
instructions: String::new(),
icon: format!("/assets/img/app-icons/{}.png", package_id),
// Empty icon: hardcoding `<id>.png` is wrong for apps that use
// `.svg` or `.webp` assets and produces a broken-image flicker.
// The frontend's `icon` computed falls through to the curated
// map which has correct extensions for known apps.
icon: String::new(),
},
manifest: Manifest {
id: package_id.to_string(),


@@ -3,7 +3,10 @@ use super::dependencies::ordered_containers_for_start;
use super::install::install_log;
use super::validation::validate_app_id;
use crate::api::rpc::RpcHandler;
use crate::data_model::PackageState;
use anyhow::{Context, Result};
use std::sync::Arc;
use tracing::warn;
/// Per-container graceful shutdown timeout in seconds.
/// Bitcoin Core needs 600s to flush UTXO set, LND 330s for channel state,
@@ -25,6 +28,12 @@ pub fn stop_timeout_secs(container_name: &str) -> &'static str {
impl RpcHandler {
/// Start a package: start all containers in dependency order.
///
/// Returns immediately with `{ "status": "starting" }` after flipping
/// the package state to `Starting` in the StateManager. The actual
/// podman-start sequence + post-start exit verification runs in a
/// background task. On success the state becomes `Running`; on error
/// it reverts to the pre-transition state.
pub(in crate::api::rpc) async fn handle_package_start(
&self,
params: Option<serde_json::Value>,
@@ -42,83 +51,52 @@ impl RpcHandler {
return Err(anyhow::anyhow!("No containers found for {}", package_id));
}
// Clear user-stopped flag — user explicitly started this app
// Clear user-stopped flag — user explicitly started this app.
// Must happen BEFORE the spawn (ordering contract with crash recovery).
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, package_id).await;
for name in &to_start {
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, name).await;
}
let package_id_owned = package_id.to_string();
let state_manager = Arc::clone(&self.state_manager);
let pre_state =
flip_package_state(&state_manager, &package_id_owned, PackageState::Starting).await;
install_log(&format!(
"START: {} (containers: {:?})",
package_id, to_start
package_id_owned, to_start
))
.await;
let mut errors = Vec::new();
for (i, name) in to_start.iter().enumerate() {
// Brief delay between dependent containers to allow initialization
if i > 0 {
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
tracing::info!("Starting container: {}", name);
let out = tokio::process::Command::new("podman")
.args(["start", name])
.output()
.await
.context(format!("Failed to exec podman start {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::error!("Failed to start {}: {}", name, stderr);
install_log(&format!("START FAIL: {}{}", name, stderr)).await;
errors.push(format!("{}: {}", name, stderr));
}
}
if !errors.is_empty() {
return Err(anyhow::anyhow!("Start failed: {}", errors.join("; ")));
}
// Verify containers actually reached running state (podman start can
// succeed even if the container exits immediately after)
tokio::time::sleep(std::time::Duration::from_secs(3)).await;
for name in &to_start {
let status = tokio::process::Command::new("podman")
.args(["inspect", name, "--format", "{{.State.Status}}"])
.output()
.await;
if let Ok(o) = status {
let state = String::from_utf8_lossy(&o.stdout).trim().to_string();
if state == "exited" {
let logs = tokio::process::Command::new("podman")
.args(["logs", "--tail", "5", name])
.output()
tokio::spawn(async move {
match do_package_start(&to_start).await {
Ok(()) => {
set_package_state(&state_manager, &package_id_owned, PackageState::Running)
.await;
let log_text = logs
.map(|o| {
let combined = format!(
"{}{}",
String::from_utf8_lossy(&o.stdout),
String::from_utf8_lossy(&o.stderr)
);
combined.chars().take(200).collect::<String>()
})
.unwrap_or_default();
tracing::error!("Container {} exited after start: {}", name, log_text);
install_log(&format!("START EXITED: {}{}", name, log_text)).await;
errors.push(format!("{}: exited after start", name));
}
Err(e) => {
tracing::error!("package.start {} failed: {:#}", package_id_owned, e);
install_log(&format!("START FAIL: {}{:#}", package_id_owned, e)).await;
if let Some(prev) = pre_state {
set_package_state(&state_manager, &package_id_owned, prev).await;
} else {
warn!(
"package.start {}: no pre-state recorded; relying on next scan",
package_id_owned
);
}
}
}
}
});
if !errors.is_empty() {
return Err(anyhow::anyhow!(
"Containers exited after start: {}",
errors.join("; ")
));
}
Ok(serde_json::Value::Null)
Ok(serde_json::json!({ "status": "starting" }))
}
/// Stop a package: mark as user-stopped and stop all containers.
///
/// Returns immediately with `{ "status": "stopping" }`. podman stop
/// (up to 600s for bitcoin-core) runs in the background.
pub(in crate::api::rpc) async fn handle_package_stop(
&self,
params: Option<serde_json::Value>,
@@ -136,43 +114,48 @@ impl RpcHandler {
return Err(anyhow::anyhow!("No containers found for {}", package_id));
}
install_log(&format!(
"STOP: {} (containers: {:?})",
package_id, containers
))
.await;
// Mark as user-stopped so health monitor and crash recovery don't auto-restart
// Mark as user-stopped BEFORE the spawn so health monitor and
// crash recovery don't auto-restart mid-flight. Ordering is
// load-bearing — see runtime.rs:145-148 original note.
crate::crash_recovery::mark_user_stopped(&self.config.data_dir, package_id).await;
for name in &containers {
crate::crash_recovery::mark_user_stopped(&self.config.data_dir, name).await;
}
let mut errors = Vec::new();
for name in &containers {
tracing::info!(
"Stopping container: {} (timeout: {}s)",
name,
stop_timeout_secs(name)
);
let out = tokio::process::Command::new("podman")
.args(["stop", "-t", stop_timeout_secs(name), name])
.output()
.await
.context(format!("Failed to exec podman stop {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::error!("Failed to stop {}: {}", name, stderr);
errors.push(format!("{}: {}", name, stderr));
}
}
let package_id_owned = package_id.to_string();
let state_manager = Arc::clone(&self.state_manager);
let pre_state =
flip_package_state(&state_manager, &package_id_owned, PackageState::Stopping).await;
if !errors.is_empty() {
return Err(anyhow::anyhow!("Stop failed: {}", errors.join("; ")));
}
Ok(serde_json::Value::Null)
install_log(&format!(
"STOP: {} (containers: {:?})",
package_id_owned, containers
))
.await;
tokio::spawn(async move {
match do_package_stop(&containers).await {
Ok(()) => {
set_package_state(&state_manager, &package_id_owned, PackageState::Stopped)
.await;
}
Err(e) => {
tracing::error!("package.stop {} failed: {:#}", package_id_owned, e);
install_log(&format!("STOP FAIL: {}{:#}", package_id_owned, e)).await;
if let Some(prev) = pre_state {
set_package_state(&state_manager, &package_id_owned, prev).await;
}
}
}
});
Ok(serde_json::json!({ "status": "stopping" }))
}
/// Restart a package: restart all containers.
///
/// Returns immediately with `{ "status": "restarting" }`. The restart
/// (up to 600s per container for bitcoin-core) runs in the background.
pub(in crate::api::rpc) async fn handle_package_restart(
&self,
params: Option<serde_json::Value>,
@@ -190,55 +173,42 @@ impl RpcHandler {
return Err(anyhow::anyhow!("No containers found for {}", package_id));
}
// Restart does not mark user-stopped; user wants the app to keep
// running. Clear any lingering marker so downstream layers don't
// interpret the brief podman stop as user intent.
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, package_id).await;
for name in &containers {
crate::crash_recovery::clear_user_stopped(&self.config.data_dir, name).await;
}
let package_id_owned = package_id.to_string();
let state_manager = Arc::clone(&self.state_manager);
let pre_state =
flip_package_state(&state_manager, &package_id_owned, PackageState::Restarting).await;
install_log(&format!(
"RESTART: {} (containers: {:?})",
package_id, containers
package_id_owned, containers
))
.await;
let mut errors = Vec::new();
for name in &containers {
tracing::info!("Restarting container: {}", name);
let out = tokio::process::Command::new("podman")
.args(["restart", "-t", stop_timeout_secs(name), name])
.output()
.await
.context(format!("Failed to exec podman restart {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::warn!(
"podman restart {} failed: {}, trying stop+start",
name,
stderr
);
// Fallback: stop then start (handles rootless podman loopback issues)
let _ = tokio::process::Command::new("podman")
.args(["stop", "-t", stop_timeout_secs(name), name])
.output()
.await;
let start_out = tokio::process::Command::new("podman")
.args(["start", name])
.output()
.await
.context(format!("Failed to exec podman start {}", name))?;
if !start_out.status.success() {
let start_err = String::from_utf8_lossy(&start_out.stderr)
.trim()
.to_string();
tracing::error!("stop+start {} also failed: {}", name, start_err);
errors.push(format!("{}: {}", name, start_err));
} else {
tracing::info!("Restarted {} via stop+start fallback", name);
tokio::spawn(async move {
match do_package_restart(&containers).await {
Ok(()) => {
set_package_state(&state_manager, &package_id_owned, PackageState::Running)
.await;
}
Err(e) => {
tracing::error!("package.restart {} failed: {:#}", package_id_owned, e);
install_log(&format!("RESTART FAIL: {}{:#}", package_id_owned, e)).await;
if let Some(prev) = pre_state {
set_package_state(&state_manager, &package_id_owned, prev).await;
}
}
}
}
});
if !errors.is_empty() {
return Err(anyhow::anyhow!("Restart failed: {}", errors.join("; ")));
}
Ok(serde_json::Value::Null)
Ok(serde_json::json!({ "status": "restarting" }))
}
/// Uninstall a package: stop and remove all related containers, clean data.
@@ -342,7 +312,8 @@ impl RpcHandler {
}
}
self.set_uninstall_stage(package_id, "Cleaning up volumes").await;
self.set_uninstall_stage(package_id, "Cleaning up volumes")
.await;
// Clean up dangling volumes associated with removed containers
let _ = tokio::process::Command::new("podman")
.args(["volume", "prune", "-f"])
@@ -371,7 +342,8 @@ impl RpcHandler {
// Clean data directories unless preserve_data
if !preserve_data {
self.set_uninstall_stage(package_id, "Removing app data").await;
self.set_uninstall_stage(package_id, "Removing app data")
.await;
let data_dirs = get_data_dirs_for_app(package_id);
for dir in &data_dirs {
tracing::info!("Uninstall {}: removing data {}", package_id, dir);
@@ -579,3 +551,185 @@ impl RpcHandler {
Ok(serde_json::json!({ "status": "stopped", "app_id": app_id }))
}
}
// ---------------------------------------------------------------------------
// Background workers for async package lifecycle RPCs.
//
// Extracted from the pre-async RPC handlers so the transitional state is
// visible to the UI immediately. Each worker is pure IO over podman and
// install_log; there is no StateManager access here, so we don't need
// a handler reference. The caller sets the crash_recovery markers before
// the spawn and does the state flipping before/after.
// ---------------------------------------------------------------------------
/// Start containers in dependency order. Includes the post-start 3s wait +
/// exit-check verification from the original synchronous handler (critical
/// for catching "podman start succeeded but container immediately exited"
/// failure modes).
async fn do_package_start(to_start: &[String]) -> Result<()> {
let mut errors = Vec::new();
for (i, name) in to_start.iter().enumerate() {
if i > 0 {
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
tracing::info!("Starting container: {}", name);
let out = tokio::process::Command::new("podman")
.args(["start", name])
.output()
.await
.context(format!("Failed to exec podman start {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::error!("Failed to start {}: {}", name, stderr);
install_log(&format!("START FAIL: {}{}", name, stderr)).await;
errors.push(format!("{}: {}", name, stderr));
}
}
if !errors.is_empty() {
return Err(anyhow::anyhow!("Start failed: {}", errors.join("; ")));
}
// Post-start exit verification (podman start can succeed even if the
// container exits immediately after).
tokio::time::sleep(std::time::Duration::from_secs(3)).await;
for name in to_start {
let status = tokio::process::Command::new("podman")
.args(["inspect", name, "--format", "{{.State.Status}}"])
.output()
.await;
if let Ok(o) = status {
let state = String::from_utf8_lossy(&o.stdout).trim().to_string();
if state == "exited" {
let logs = tokio::process::Command::new("podman")
.args(["logs", "--tail", "5", name])
.output()
.await;
let log_text = logs
.map(|o| {
let combined = format!(
"{}{}",
String::from_utf8_lossy(&o.stdout),
String::from_utf8_lossy(&o.stderr)
);
combined.chars().take(200).collect::<String>()
})
.unwrap_or_default();
tracing::error!("Container {} exited after start: {}", name, log_text);
install_log(&format!("START EXITED: {}{}", name, log_text)).await;
errors.push(format!("{}: exited after start", name));
}
}
}
if !errors.is_empty() {
return Err(anyhow::anyhow!(
"Containers exited after start: {}",
errors.join("; ")
));
}
Ok(())
}
/// Stop all containers with their per-container graceful-shutdown timeout.
async fn do_package_stop(containers: &[String]) -> Result<()> {
let mut errors = Vec::new();
for name in containers {
tracing::info!(
"Stopping container: {} (timeout: {}s)",
name,
stop_timeout_secs(name)
);
let out = tokio::process::Command::new("podman")
.args(["stop", "-t", stop_timeout_secs(name), name])
.output()
.await
.context(format!("Failed to exec podman stop {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::error!("Failed to stop {}: {}", name, stderr);
errors.push(format!("{}: {}", name, stderr));
}
}
if !errors.is_empty() {
return Err(anyhow::anyhow!("Stop failed: {}", errors.join("; ")));
}
Ok(())
}
/// Restart via `podman restart`, falling back to stop+start when restart
/// fails (rootless podman loopback issues).
async fn do_package_restart(containers: &[String]) -> Result<()> {
let mut errors = Vec::new();
for name in containers {
tracing::info!("Restarting container: {}", name);
let out = tokio::process::Command::new("podman")
.args(["restart", "-t", stop_timeout_secs(name), name])
.output()
.await
.context(format!("Failed to exec podman restart {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr).trim().to_string();
tracing::warn!(
"podman restart {} failed: {}, trying stop+start",
name,
stderr
);
// Fallback: stop then start
let _ = tokio::process::Command::new("podman")
.args(["stop", "-t", stop_timeout_secs(name), name])
.output()
.await;
let start_out = tokio::process::Command::new("podman")
.args(["start", name])
.output()
.await
.context(format!("Failed to exec podman start {}", name))?;
if !start_out.status.success() {
let start_err = String::from_utf8_lossy(&start_out.stderr)
.trim()
.to_string();
tracing::error!("stop+start {} also failed: {}", name, start_err);
errors.push(format!("{}: {}", name, start_err));
} else {
tracing::info!("Restarted {} via stop+start fallback", name);
}
}
}
if !errors.is_empty() {
return Err(anyhow::anyhow!("Restart failed: {}", errors.join("; ")));
}
Ok(())
}
/// Flip the primary package entry's state and return the pre-transition
/// state for revert on error. Mirrors `transitional::flip_to_transitional`
/// but lives here because the package path keys by `package_id` (which may
/// differ from the container name used by orchestrator-level entries).
async fn flip_package_state(
state_manager: &crate::state::StateManager,
package_id: &str,
transitional: PackageState,
) -> Option<PackageState> {
let (mut data, _) = state_manager.get_snapshot().await;
let prev = data.package_data.get(package_id).map(|e| e.state.clone());
if let Some(entry) = data.package_data.get_mut(package_id) {
entry.state = transitional;
state_manager.update_data(data).await;
}
prev
}
/// Write the package entry's final state. No-op if the entry has since
/// been removed (uninstall race).
async fn set_package_state(
state_manager: &crate::state::StateManager,
package_id: &str,
new_state: PackageState,
) {
let (mut data, _) = state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get_mut(package_id) {
if entry.state != new_state {
entry.state = new_state;
state_manager.update_data(data).await;
}
}
}
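
The start, stop and restart handlers above share one pattern: flip to a transitional state, remember the previous state, then either settle on the final state or revert. A minimal sketch of that pattern using only the standard library follows. The `State` enum, the `Store` alias and the helper names are hypothetical stand-ins for `PackageState` and `StateManager`, and the work runs synchronously here rather than inside `tokio::spawn`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for PackageState.
#[derive(Clone, Debug, PartialEq)]
enum State {
    Stopped,
    Starting,
    Running,
}

// Hypothetical stand-in for the StateManager's package map.
type Store = Arc<Mutex<HashMap<String, State>>>;

/// Flip an existing entry to a transitional state and return the state it
/// had before. Returns None (and changes nothing) if the entry is missing.
fn flip(store: &Store, id: &str, transitional: State) -> Option<State> {
    let mut map = store.lock().unwrap();
    let entry = map.get_mut(id)?;
    Some(std::mem::replace(entry, transitional))
}

/// Write a final state; a no-op if the entry has been removed meanwhile.
fn set(store: &Store, id: &str, new_state: State) {
    let mut map = store.lock().unwrap();
    if let Some(entry) = map.get_mut(id) {
        *entry = new_state;
    }
}

/// Read the current state of an entry, if present.
fn state_of(store: &Store, id: &str) -> Option<State> {
    let map = store.lock().unwrap();
    let current = map.get(id).cloned();
    current
}

/// Flip to Starting, run the work, then settle on Running or revert to
/// the pre-transition state on error.
fn start_with_revert(store: &Store, id: &str, work: impl FnOnce() -> Result<(), String>) {
    let prev = flip(store, id, State::Starting);
    match work() {
        Ok(()) => set(store, id, State::Running),
        Err(_) => {
            if let Some(prev) = prev {
                set(store, id, prev);
            }
        }
    }
}
```

Because `flip` hands back the previous state, the error path needs no extra lookup to know where to revert.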


@@ -4,6 +4,7 @@
//! containers in dependency order.
use crate::api::rpc::RpcHandler;
use crate::data_model::InstallPhase;
use anyhow::{Context, Result};
use tracing::info;
@@ -60,7 +61,71 @@ async fn adopt_stack_if_exists(
})))
}
const REGISTRY: &str = "git.tx1138.com/lfg2025";
async fn install_stack_via_orchestrator(
handler: &RpcHandler,
stack_name: &str,
app_ids: &[&str],
) -> Result<Option<serde_json::Value>> {
let Some(orchestrator) = handler.orchestrator.as_ref() else {
return Ok(None);
};
install_log(&format!(
"INSTALL ORCH: {} stack — attempting orchestrator install of [{}]",
stack_name,
app_ids.join(", ")
))
.await;
for app_id in app_ids {
match orchestrator.install(app_id).await {
Ok(container_name) => {
install_log(&format!(
"INSTALL ORCH: {} stack — app {} installed as {}",
stack_name, app_id, container_name
))
.await;
}
Err(e) if e.to_string().contains("unknown app_id") => {
install_log(&format!(
"INSTALL ORCH SKIP: {} stack — app {} unknown, falling back to legacy stack installer",
stack_name, app_id
))
.await;
return Ok(None);
}
Err(e) => {
install_log(&format!(
"INSTALL ORCH FAIL: {} stack — app {} failed: {}",
stack_name, app_id, e
))
.await;
return Err(e.context(format!(
"orchestrator stack install {} failed at app {}",
stack_name, app_id
)));
}
}
}
install_log(&format!("INSTALL ORCH OK: {} stack", stack_name)).await;
Ok(Some(serde_json::json!({
"success": true,
"package_id": stack_name,
"message": format!("{} stack installed and started", stack_name),
"path": "orchestrator"
})))
}
fn btcpay_stack_app_ids() -> &'static [&'static str] {
&["archy-btcpay-db", "archy-nbxplorer", "btcpay-server"]
}
fn mempool_stack_app_ids() -> &'static [&'static str] {
&["archy-mempool-db", "mempool-api", "archy-mempool-web"]
}
const REGISTRY: &str = "146.59.87.168:3000/lfg2025";
/// Pull an image with retry and exponential backoff (3 attempts).
async fn pull_image_with_retry(image: &str) -> Result<()> {
@@ -135,13 +200,20 @@ impl RpcHandler {
}
let images = [
"git.tx1138.com/lfg2025/immich-postgres:14-vectorchord0.4.3-pgvectors0.2.0",
"git.tx1138.com/lfg2025/valkey:7-alpine",
"git.tx1138.com/lfg2025/immich-server:release",
"146.59.87.168:3000/lfg2025/immich-postgres:14-vectorchord0.4.3-pgvectors0.2.0",
"146.59.87.168:3000/lfg2025/valkey:7-alpine",
"146.59.87.168:3000/lfg2025/immich-server:release",
];
for img in &images {
self.set_install_phase("immich", InstallPhase::PullingImage)
.await;
let n_images = images.len() as u64;
for (i, img) in images.iter().enumerate() {
self.set_install_progress("immich", i as u64, n_images).await;
pull_image_with_retry(img).await?;
}
self.set_install_progress("immich", n_images, n_images).await;
self.set_install_phase("immich", InstallPhase::CreatingContainer)
.await;
let _ = tokio::process::Command::new("sudo")
.args([
@@ -152,6 +224,16 @@ impl RpcHandler {
])
.output()
.await;
let _ = tokio::process::Command::new("sudo")
.args([
"chown",
"-R",
"1000:1000",
"/var/lib/archipelago/immich",
"/var/lib/archipelago/immich-db",
])
.output()
.await;
let _ = tokio::process::Command::new("podman")
.args(["network", "create", "immich-net"])
.output()
@@ -191,7 +273,7 @@ impl RpcHandler {
"POSTGRES_USER=postgres",
"-e",
"POSTGRES_DB=immich",
"git.tx1138.com/lfg2025/immich-postgres:14-vectorchord0.4.3-pgvectors0.2.0",
"146.59.87.168:3000/lfg2025/immich-postgres:14-vectorchord0.4.3-pgvectors0.2.0",
])
.output()
.await;
@@ -210,13 +292,15 @@ impl RpcHandler {
"--network-alias",
"immich_redis",
"--cap-drop=ALL",
"--cap-add=SETGID",
"--cap-add=SETUID",
"--security-opt=no-new-privileges:true",
"--memory=128m",
"--pids-limit=2048",
"--health-cmd=valkey-cli ping || exit 1",
"--health-interval=30s",
"--health-retries=3",
"git.tx1138.com/lfg2025/valkey:7-alpine",
"146.59.87.168:3000/lfg2025/valkey:7-alpine",
])
.output()
.await;
@@ -254,7 +338,7 @@ impl RpcHandler {
"REDIS_HOSTNAME=immich_redis",
"-e",
"UPLOAD_LOCATION=/usr/src/app/upload",
"git.tx1138.com/lfg2025/immich-server:release",
"146.59.87.168:3000/lfg2025/immich-server:release",
])
.output()
.await
@@ -265,6 +349,13 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Failed to start Immich server: {}", stderr));
}
self.set_install_phase("immich", InstallPhase::WaitingHealthy)
.await;
self.set_install_phase("immich", InstallPhase::PostInstall)
.await;
self.set_install_phase("immich", InstallPhase::Done).await;
self.clear_install_progress("immich").await;
info!("Immich stack installed and started");
Ok(serde_json::json!({
"success": true,
@@ -273,7 +364,6 @@ impl RpcHandler {
}))
}
/// Install BTCPay stack (postgres + nbxplorer + btcpay-server).
pub(super) async fn install_btcpay_stack(&self) -> Result<serde_json::Value> {
if let Some(adopted) = adopt_stack_if_exists(
@@ -286,6 +376,12 @@ impl RpcHandler {
return Ok(adopted);
}
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "btcpay-server", btcpay_stack_app_ids()).await?
{
return Ok(orchestrated);
}
// Dependency check: Bitcoin must be running
let deps = super::dependencies::detect_running_deps().await?;
super::dependencies::check_install_deps("btcpay-server", &deps)?;
@@ -303,9 +399,18 @@ impl RpcHandler {
&format!("{}/nbxplorer:2.6.0", REGISTRY),
&format!("{}/btcpayserver:1.13.7", REGISTRY),
];
for img in &images {
self.set_install_phase("btcpay-server", InstallPhase::PullingImage)
.await;
let n_images = images.len() as u64;
for (i, img) in images.iter().enumerate() {
self.set_install_progress("btcpay-server", i as u64, n_images)
.await;
pull_image_with_retry(img).await?;
}
self.set_install_progress("btcpay-server", n_images, n_images)
.await;
self.set_install_phase("btcpay-server", InstallPhase::CreatingContainer)
.await;
// Create data dirs (chown to current user so rootless podman can write)
let _ = tokio::process::Command::new("sudo")
@@ -460,6 +565,14 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Failed to start BTCPay Server: {}", stderr));
}
self.set_install_phase("btcpay-server", InstallPhase::WaitingHealthy)
.await;
self.set_install_phase("btcpay-server", InstallPhase::PostInstall)
.await;
self.set_install_phase("btcpay-server", InstallPhase::Done)
.await;
self.clear_install_progress("btcpay-server").await;
install_log("INSTALL OK: btcpay-server stack").await;
info!("BTCPay stack installed and started");
Ok(serde_json::json!({
@@ -473,34 +586,52 @@ impl RpcHandler {
/// Install Mempool stack (mariadb + mempool-api + mempool-web).
pub(super) async fn install_mempool_stack(&self) -> Result<serde_json::Value> {
if let Some(adopted) = adopt_stack_if_exists(
"archy-mempool-web",
"mempool",
&["archy-mempool-db", "archy-mempool-api", "archy-mempool-web"],
"mempool",
&[
"archy-mempool-db",
"mempool-api",
"mempool",
"archy-mempool-web",
"archy-mempool-api",
],
)
.await?
{
return Ok(adopted);
}
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "mempool", mempool_stack_app_ids()).await?
{
return Ok(orchestrated);
}
// Dependency check: Bitcoin + ElectrumX must be running
let deps = super::dependencies::detect_running_deps().await?;
super::dependencies::check_install_deps("mempool", &deps)?;
let (_, rpc_pass) = crate::bitcoin_rpc::bitcoin_rpc_credentials().await;
install_log("INSTALL START: mempool (stack: mariadb + mempool-api + mempool-web)").await;
let (rpc_user, rpc_pass) = crate::bitcoin_rpc::bitcoin_rpc_credentials().await;
let db_pass = super::config::read_or_generate_secret("mempool-db-password").await;
let root_pass = super::config::read_or_generate_secret("mempool-db-root-password").await;
let root_pass = super::config::read_or_generate_secret("mysql-root-db-password").await;
let images = [
&format!("{}/mariadb:11.4.10", REGISTRY),
&format!("{}/mempool-backend:v3.0.0", REGISTRY),
&format!("{}/mempool-frontend:v3.0.0", REGISTRY),
];
for img in &images {
self.set_install_phase("mempool", InstallPhase::PullingImage)
.await;
let n_images = images.len() as u64;
for (i, img) in images.iter().enumerate() {
self.set_install_progress("mempool", i as u64, n_images).await;
pull_image_with_retry(img).await?;
}
self.set_install_progress("mempool", n_images, n_images).await;
self.set_install_phase("mempool", InstallPhase::CreatingContainer)
.await;
// Create data dirs (chown to current user so rootless podman can write)
let _ = tokio::process::Command::new("sudo")
@@ -594,17 +725,17 @@ impl RpcHandler {
"-e",
"MEMPOOL_BACKEND=electrum",
"-e",
"ELECTRUM_HOST=host.containers.internal",
"ELECTRUM_HOST=electrumx",
"-e",
"ELECTRUM_PORT=50001",
"-e",
"ELECTRUM_TLS_ENABLED=false",
"-e",
"CORE_RPC_HOST=host.containers.internal",
"CORE_RPC_HOST=bitcoin-knots",
"-e",
"CORE_RPC_PORT=8332",
"-e",
&format!("CORE_RPC_USERNAME={}", rpc_user),
"CORE_RPC_USERNAME=archipelago",
"-e",
&format!("CORE_RPC_PASSWORD={}", rpc_pass),
"-e",
@@ -658,6 +789,13 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Failed to start Mempool: {}", stderr));
}
self.set_install_phase("mempool", InstallPhase::WaitingHealthy)
.await;
self.set_install_phase("mempool", InstallPhase::PostInstall)
.await;
self.set_install_phase("mempool", InstallPhase::Done).await;
self.clear_install_progress("mempool").await;
install_log("INSTALL OK: mempool stack").await;
info!("Mempool stack installed and started");
Ok(serde_json::json!({
@@ -677,7 +815,7 @@ impl RpcHandler {
.into_iter()
.find(|r| r.enabled)
.map(|r| r.url)
.unwrap_or_else(|| "git.tx1138.com/lfg2025".to_string());
.unwrap_or_else(|| "146.59.87.168:3000/lfg2025".to_string());
let user_tmp = format!(
"{}/.local/share/containers/tmp",
@@ -702,12 +840,22 @@ impl RpcHandler {
// Pull all images with retry; fail the install if any image can't be pulled.
// Previously this just logged a warning and continued, leaving the stack
// broken and the user seeing "failed" with no recovery path.
for img in &images {
self.set_install_phase("indeedhub", InstallPhase::PullingImage)
.await;
let n_images = images.len() as u64;
for (i, img) in images.iter().enumerate() {
// set_install_progress fills the byte-counter fallback the UI uses
// when it can't read podman's pull output — gives the bar a clear
// X-of-N step as each image lands.
self.set_install_progress("indeedhub", i as u64, n_images)
.await;
info!("Pulling {}", img);
pull_image_with_retry(img)
.await
.with_context(|| format!("Failed to pull IndeedHub image: {}", img))?;
}
self.set_install_progress("indeedhub", n_images, n_images)
.await;
// Remove any leftover containers from a previous partial install (or
// from the first-boot frontend stub that used to race the installer).
@@ -734,6 +882,12 @@ impl RpcHandler {
.status()
.await;
// Phase: CreatingContainer — pulls done, network rebuilt, now spinning
// up the 7 stack containers. Bar advances from PullingImage band into
// CreatingContainer band so the user sees movement.
self.set_install_phase("indeedhub", InstallPhase::CreatingContainer)
.await;
// Create indeedhub-net
let _ = tokio::process::Command::new("podman")
.args(["network", "create", "indeedhub-net"])
@@ -860,26 +1014,38 @@ impl RpcHandler {
"-e",
"DATABASE_HOST=postgres",
"-e",
"DATABASE_PORT=5432",
"-e",
"DATABASE_USER=indeedhub",
"-e",
&format!("DATABASE_PASSWORD={}", db_pass),
"-e",
"DATABASE_NAME=indeedhub",
"-e",
"REDIS_HOST=redis",
"QUEUE_HOST=redis",
"-e",
"QUEUE_PORT=6379",
"-e",
"S3_ENDPOINT=http://minio:9000",
"-e",
"AWS_REGION=us-east-1",
"-e",
&format!("AWS_ACCESS_KEY={}", minio_user),
"-e",
&format!("AWS_SECRET_KEY={}", minio_pass),
"-e",
"S3_PUBLIC_BUCKET_NAME=indeedhub-public",
"-e",
"S3_PRIVATE_BUCKET_NAME=indeedhub-private",
"-e",
"S3_PUBLIC_BUCKET_URL=/storage",
"-e",
&format!("NOSTR_JWT_SECRET={}", jwt_secret),
"-e",
"NOSTR_JWT_EXPIRES_IN=7d",
"-e",
"AES_MASTER_SECRET=0123456789abcdef0123456789abcdef",
"-e",
"ENVIRONMENT=production",
&format!("{}/indeedhub-api:1.0.0", registry),
])
@@ -901,6 +1067,8 @@ impl RpcHandler {
"-e",
"DATABASE_HOST=postgres",
"-e",
"DATABASE_PORT=5432",
"-e",
"DATABASE_USER=indeedhub",
"-e",
&format!("DATABASE_PASSWORD={}", db_pass),
@@ -909,6 +1077,8 @@ impl RpcHandler {
"-e",
"QUEUE_HOST=redis",
"-e",
"QUEUE_PORT=6379",
"-e",
"S3_ENDPOINT=http://minio:9000",
"-e",
&format!("AWS_ACCESS_KEY={}", minio_user),
@@ -919,6 +1089,8 @@ impl RpcHandler {
"-e",
"S3_PUBLIC_BUCKET_NAME=indeedhub-public",
"-e",
"S3_PRIVATE_BUCKET_NAME=indeedhub-private",
"-e",
"ENVIRONMENT=production",
"-e",
"AES_MASTER_SECRET=0123456789abcdef0123456789abcdef",
@@ -956,6 +1128,17 @@ impl RpcHandler {
return Err(anyhow::anyhow!("IndeedHub frontend failed: {}", err));
}
// Phase: WaitingHealthy → PostInstall → clear. The actual readiness
// gate is the package scanner's next sweep; this just gives the UI a
// truthful end-of-install signal so the bar settles at 95→100→done
// instead of sitting at "Queued… 2%" forever.
self.set_install_phase("indeedhub", InstallPhase::WaitingHealthy)
.await;
self.set_install_phase("indeedhub", InstallPhase::PostInstall)
.await;
self.set_install_phase("indeedhub", InstallPhase::Done).await;
self.clear_install_progress("indeedhub").await;
install_log("INSTALL OK: indeedhub stack").await;
info!("IndeedHub stack installed");
Ok(serde_json::json!({
@@ -965,3 +1148,20 @@ impl RpcHandler {
}))
}
}
#[cfg(test)]
mod tests {
use super::{btcpay_stack_app_ids, mempool_stack_app_ids};
#[test]
fn stack_app_id_sets_match_migration_manifests() {
assert_eq!(
btcpay_stack_app_ids(),
["archy-btcpay-db", "archy-nbxplorer", "btcpay-server"]
);
assert_eq!(
mempool_stack_app_ids(),
["archy-mempool-db", "mempool-api", "archy-mempool-web"]
);
}
}
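
Note on the progress band: the v1.7.46-alpha release note says stack pull progress is interpolated across 20→70% before the bar moves into the CreatingContainer band set in the hunk above. The arithmetic might look like the sketch below. Only the 20 and 70 endpoints come from the release note; the function name, its signature, and the per-image granularity are assumptions made for illustration.

```rust
/// Overall install percentage while stack images are being pulled.
/// The 20..=70 band is taken from the release note; everything else here
/// (name, signature, counting whole images) is a hypothetical sketch.
fn pull_band_percent(images_done: usize, images_total: usize) -> u8 {
    const BAND_START: usize = 20;
    const BAND_END: usize = 70;
    if images_total == 0 {
        // Nothing to pull: report the end of the band so the bar can move on.
        return BAND_END as u8;
    }
    // Clamp so a miscounted "done" can never push the bar past the band.
    let done = images_done.min(images_total);
    (BAND_START + (BAND_END - BAND_START) * done / images_total) as u8
}
```

With seven stack containers, three finished pulls would put the bar at 41% under these assumptions.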


@@ -1,7 +1,7 @@
//! Per-app manual update handler.
//!
//! Flow: validate → set Updating state → graceful stop → pull new image(s) →
//! remove old container(s) → recreate via reconcile script → verify running.
//! remove old container(s) → recreate (orchestrator-first, legacy fallback) → verify running.
//! Data volumes are preserved (bind mounts, not stored in container).
use super::config::get_containers_for_app;
@@ -11,7 +11,7 @@ use super::runtime::stop_timeout_secs;
use super::validation::validate_app_id;
use crate::api::rpc::RpcHandler;
use crate::container::image_versions;
use crate::data_model::PackageState;
use crate::data_model::{InstallPhase, PackageState};
use anyhow::{Context, Result};
use tokio::io::{AsyncBufReadExt, BufReader};
use tracing::{error, info, warn};
@@ -34,15 +34,10 @@ impl RpcHandler {
let pinned = image_versions::pinned_image_for_app(package_id)
.ok_or_else(|| anyhow::anyhow!("No pinned image found for {}", package_id))?;
// Reject if already updating
{
let (data, _) = self.state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get(package_id) {
if entry.state == PackageState::Updating {
return Err(anyhow::anyhow!("{} is already updating", package_id));
}
}
}
// Note: the `already updating` guard lives in `spawn_package_update`
// (the async wrapper that dispatch actually routes to). By the time
// this inner function runs, the wrapper has already flipped state to
// `Updating`, so duplicating the check here would be a false positive.
install_log(&format!("UPDATE: {} - {}", package_id, pinned)).await;
@@ -56,6 +51,64 @@ impl RpcHandler {
self.state_manager.update_data(data).await;
}
// Preferred path: for single-container apps managed by manifests, route
// updates through the orchestrator's upgrade lifecycle instead of the
// legacy shell/CLI flow. Keep stack-style packages on legacy for now.
if should_try_orchestrator_update(package_id, self.orchestrator.is_some()) {
let orchestrator_app_id = orchestrator_update_app_id(package_id);
self.set_install_phase(package_id, InstallPhase::Preparing)
.await;
install_log(&format!(
"UPDATE ORCH: {} — attempting orchestrator upgrade as {}",
package_id, orchestrator_app_id
))
.await;
if let Some(orchestrator) = self.orchestrator.as_ref() {
match orchestrator.upgrade(orchestrator_app_id).await {
Ok(()) => {
self.set_install_phase(package_id, InstallPhase::WaitingHealthy)
.await;
if let Ok(health) = orchestrator.health(orchestrator_app_id).await {
if health != "healthy" {
warn!(
"Update {}: orchestrator upgrade completed with health={} (expected healthy)",
package_id, health
);
}
}
install_log(&format!(
"UPDATE ORCH OK: {} (app={})",
package_id, orchestrator_app_id
))
.await;
self.clear_install_progress(package_id).await;
return Ok(serde_json::json!({
"status": "updated",
"package_id": package_id,
}));
}
Err(e) if is_unknown_app_id_error(&e) => {
info!(
"Update {}: orchestrator has no manifest mapping yet, falling back to legacy updater",
package_id
);
install_log(&format!(
"UPDATE ORCH SKIP: {} — unknown app_id, using legacy flow",
package_id
))
.await;
}
Err(e) => {
install_log(&format!("UPDATE ORCH FAIL: {} - {}", package_id, e)).await;
self.clear_install_progress(package_id).await;
self.clear_update_state(package_id).await;
return Err(e.context(format!("Orchestrator update {} failed", package_id)));
}
}
}
}
// Resolve images to pull — either a stack or single container
let images_to_pull = self.resolve_images_to_pull(package_id, &pinned);
@@ -101,6 +154,11 @@ impl RpcHandler {
containers: &[String],
images_to_pull: &[(String, String)],
) -> Result<()> {
// Phase: Preparing — about to stop the running container(s) so
// we can swap images. Fast.
self.set_install_phase(package_id, InstallPhase::Preparing)
.await;
// 1. Graceful stop all containers (reverse order for dependencies)
info!(
"Update {}: stopping {} containers",
@@ -130,6 +188,10 @@ impl RpcHandler {
}
}
// Phase: PullingImage — about to fetch each pinned image in turn.
self.set_install_phase(package_id, InstallPhase::PullingImage)
.await;
// 2. Pull new images with progress
info!(
"Update {}: pulling {} images",
@@ -173,38 +235,23 @@ impl RpcHandler {
}
}
// 4. Recreate via reconcile script (single source of truth for container specs)
info!("Update {}: recreating containers via reconcile", package_id);
// Phase: CreatingContainer — about to recreate each container.
self.set_install_phase(package_id, InstallPhase::CreatingContainer)
.await;
// 4. Recreate containers through the orchestrator (each candidate app_id is tried in turn)
info!("Update {}: recreating containers", package_id);
for name in containers {
let out = tokio::process::Command::new("bash")
.args([
"/opt/archipelago/scripts/reconcile-containers.sh",
&format!("--container={}", name),
"--force",
])
.output()
.await
.context(format!("Failed to reconcile {}", name))?;
if !out.status.success() {
let stderr = String::from_utf8_lossy(&out.stderr);
let stdout = String::from_utf8_lossy(&out.stdout);
error!(
"Update {}: reconcile {} failed:\nstdout: {}\nstderr: {}",
package_id,
name,
stdout.trim(),
stderr.trim()
);
return Err(anyhow::anyhow!(
"Reconcile failed for {}: {}",
name,
stderr.trim()
));
}
self.recreate_container_for_update(package_id, name).await?;
// Brief delay between containers for dependency initialization
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
// Phase: WaitingHealthy — the recreate step has started every container,
// now verifying each reached running state.
self.set_install_phase(package_id, InstallPhase::WaitingHealthy)
.await;
// 5. Verify containers reached running state
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
for name in containers {
@@ -226,6 +273,51 @@ impl RpcHandler {
Ok(())
}
async fn recreate_container_for_update(
&self,
package_id: &str,
container_name: &str,
) -> Result<()> {
let Some(orchestrator) = self.orchestrator.as_ref() else {
return Err(anyhow::anyhow!(
"Cannot recreate {} during update {}: orchestrator unavailable",
container_name,
package_id
));
};
let mut attempted = Vec::new();
for app_id in candidate_app_ids_for_container(container_name) {
attempted.push(app_id.clone());
match orchestrator.install(&app_id).await {
Ok(created_name) => {
install_log(&format!(
"UPDATE ORCH RECREATE OK: {} — container={} app_id={} created={}",
package_id, container_name, app_id, created_name
))
.await;
return Ok(());
}
Err(e) if is_unknown_app_id_error(&e) => {
continue;
}
Err(e) => {
return Err(e.context(format!(
"orchestrator recreate failed for update {} (container={}, app_id={})",
package_id, container_name, app_id
)));
}
}
}
Err(anyhow::anyhow!(
"No manifest mapping found while recreating {} during update {} (attempted app_ids: {})",
container_name,
package_id,
attempted.join(", ")
))
}
/// Pull a single image with progress broadcasting (reuses install progress pattern).
async fn pull_update_image(&self, package_id: &str, image: &str) -> Result<()> {
self.set_install_progress(package_id, 0, 0).await;
@@ -296,15 +388,16 @@ impl RpcHandler {
Ok(o) => {
let stderr = String::from_utf8_lossy(&o.stderr);
warn!("Rollback: could not restart {}: {}", name, stderr.trim());
// Container was already removed — try reconcile to recreate with old image
let _ = tokio::process::Command::new("bash")
.args([
"/opt/archipelago/scripts/reconcile-containers.sh",
&format!("--container={}", name),
"--force",
])
.output()
.await;
// Container was already removed (forward path ran `podman rm`).
// Recreate through the orchestrator; a failure here is logged, not propagated.
if let Err(recreate_err) =
self.recreate_container_for_update(package_id, name).await
{
error!(
"Rollback: failed to recreate {} during rollback of {}: {}",
name, package_id, recreate_err
);
}
}
Err(e) => {
error!("Rollback: failed to restart {}: {}", name, e);
@@ -325,3 +418,129 @@ impl RpcHandler {
self.state_manager.update_data(data).await;
}
}
fn should_try_orchestrator_update(package_id: &str, orchestrator_available: bool) -> bool {
orchestrator_available && !uses_legacy_update_flow(package_id)
}
fn orchestrator_update_app_id(package_id: &str) -> &str {
match package_id {
"bitcoin-knots" => "bitcoin-core",
"electrs" | "mempool-electrs" => "electrumx",
_ => package_id,
}
}
fn uses_legacy_update_flow(package_id: &str) -> bool {
matches!(
package_id,
// Multi-container stacks still updated via the stack-aware path.
"immich" | "penpot" | "penpot-frontend" | "indeedhub"
)
}
fn is_unknown_app_id_error(err: &anyhow::Error) -> bool {
err.chain()
.any(|cause| cause.to_string().contains("unknown app_id"))
}
fn candidate_app_ids_for_container(container_name: &str) -> Vec<String> {
let mut out = Vec::new();
let mut push = |s: &str| {
if !out.iter().any(|e: &String| e == s) {
out.push(s.to_string());
}
};
match container_name {
"bitcoin-knots" | "bitcoin-core" => {
push("bitcoin-core");
push("bitcoin-knots");
}
"archy-bitcoin-ui" => push("bitcoin-ui"),
"archy-lnd-ui" => push("lnd-ui"),
"archy-electrs-ui" => push("electrs-ui"),
"mempool" => {
push("archy-mempool-web");
push("mempool");
}
_ => {}
}
push(container_name);
if let Some(stripped) = container_name.strip_prefix("archy-") {
push(stripped);
}
out
}
#[cfg(test)]
mod tests {
use super::{
candidate_app_ids_for_container, orchestrator_update_app_id,
should_try_orchestrator_update, uses_legacy_update_flow,
};
#[test]
fn legacy_flow_for_stack_apps() {
for app in ["immich", "penpot", "indeedhub"] {
assert!(uses_legacy_update_flow(app), "{app} should stay legacy");
}
}
#[test]
fn orchestrator_flow_for_single_apps() {
for app in [
"lnd",
"bitcoin-core",
"searxng",
"grafana",
"btcpay-server",
"mempool",
"fedimint",
] {
assert!(
!uses_legacy_update_flow(app),
"{app} should be orchestrator-first"
);
assert!(
should_try_orchestrator_update(app, true),
"{app} should use orchestrator when available"
);
}
}
#[test]
fn no_orchestrator_means_no_orchestrator_flow() {
assert!(!should_try_orchestrator_update("lnd", false));
assert!(!should_try_orchestrator_update("btcpay-server", false));
}
#[test]
fn container_name_candidates_cover_common_aliases() {
assert_eq!(
candidate_app_ids_for_container("bitcoin-knots"),
vec!["bitcoin-core", "bitcoin-knots"]
);
assert_eq!(
candidate_app_ids_for_container("archy-bitcoin-ui"),
vec!["bitcoin-ui", "archy-bitcoin-ui"]
);
assert_eq!(
candidate_app_ids_for_container("mempool"),
vec!["archy-mempool-web", "mempool"]
);
assert_eq!(
candidate_app_ids_for_container("archy-mempool-db"),
vec!["archy-mempool-db", "mempool-db"]
);
}
#[test]
fn update_aliases_map_to_manifest_app_ids() {
assert_eq!(orchestrator_update_app_id("bitcoin-knots"), "bitcoin-core");
assert_eq!(orchestrator_update_app_id("electrs"), "electrumx");
assert_eq!(orchestrator_update_app_id("mempool-electrs"), "electrumx");
assert_eq!(orchestrator_update_app_id("fedimint"), "fedimint");
}
}
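
Note on `is_unknown_app_id_error`: the helper above decides whether to fall back by searching the error chain for the substring "unknown app_id". A typed alternative could walk the same chain and downcast instead, so a reworded message would not silently disable the fallback. The sketch below uses only the standard library. `UnknownAppId` and `WithContext` are hypothetical types, not part of this diff, and `WithContext` merely stands in for the context wrapping that `anyhow` provides.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical typed error an orchestrator could return for an unmapped app.
#[derive(Debug)]
struct UnknownAppId(String);

impl fmt::Display for UnknownAppId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown app_id: {}", self.0)
    }
}

impl Error for UnknownAppId {}

/// Hypothetical wrapper that adds a message on top of an inner error.
#[derive(Debug)]
struct WithContext {
    msg: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.msg)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Walk the `source()` chain and report whether any link is `UnknownAppId`.
fn is_unknown_app_id(err: &(dyn Error + 'static)) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if e.downcast_ref::<UnknownAppId>().is_some() {
            return true;
        }
        current = e.source();
    }
    false
}
```

The trade-off is that every orchestrator implementation would have to return the typed error, whereas the substring check in the diff works with any error that happens to carry the message.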


@@ -147,8 +147,7 @@ impl RpcHandler {
.get("onion")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing onion"))?;
let fips_npub =
crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
let reachable = node_message::check_peer_reachable(onion, fips_npub.as_deref())
.await
.unwrap_or(false);


@@ -0,0 +1,172 @@
//! Async lifecycle helper for container Stop/Start/Restart RPCs.
//!
//! The `ContainerOrchestrator` trait is intentionally synchronous — blocking
//! calls keep the reconciler, boot flow, chaos harness, and unit tests
//! deterministic. But the RPC layer must return to the UI in <1s so the
//! dashboard can render a transitional "Stopping…" / "Starting…" label while
//! the underlying `podman stop` (up to 600s for bitcoin-core) runs in the
//! background.
//!
//! `RpcHandler::spawn_transitional` bridges the two: it
//! 1. flips the package state in `StateManager` to the appropriate
//! transitional variant (`Stopping` / `Starting` / `Restarting`),
//! which fans out to WebSocket clients immediately.
//! 2. `tokio::spawn`s the actual orchestrator call.
//! 3. on success, writes the final state (`Stopped` / `Running`).
//! 4. on error, reverts to the pre-transition state and logs via
//! `install_log()` so the incident shows up in
//! `/var/log/archipelago/container-installs.log`.
//!
//! The server.rs package-scan loop must also be taught to preserve
//! transitional states — see `server.rs:scan_and_update_packages`'s merge
//! logic and the companion `merge_preserving_transitional` helper.
use super::package::install_log;
use super::RpcHandler;
use crate::container::ContainerOrchestrator;
use crate::data_model::PackageState;
use crate::state::StateManager;
use anyhow::Result;
use std::sync::Arc;
use tracing::{error, info, warn};
/// The three transitional lifecycle operations that run asynchronously from
/// the RPC handler. `Install` and `Remove` are intentionally NOT here — they
/// already have their own progress-tracking paths (`install_progress`,
/// `uninstall_stage`) with multi-step UI feedback.
#[derive(Debug, Clone, Copy)]
pub(super) enum Op {
Stop,
Start,
Restart,
}
impl Op {
/// The `PackageState` to set on the entry while the operation is in
/// flight. The package-scan merge loop must preserve this variant and
/// refuse to overwrite it with whatever podman reports (see
/// `merge_preserving_transitional` in server.rs).
fn transitional_state(self) -> PackageState {
match self {
Op::Stop => PackageState::Stopping,
Op::Start => PackageState::Starting,
Op::Restart => PackageState::Restarting,
}
}
/// The `PackageState` to set on success. On error the caller reverts to
/// the pre-transition state rather than using these.
fn final_state_on_success(self) -> PackageState {
match self {
Op::Stop => PackageState::Stopped,
Op::Start => PackageState::Running,
Op::Restart => PackageState::Running,
}
}
/// Prefix used in `install_log` entries so post-mortem readers can grep
/// the operation that failed.
fn log_prefix(self) -> &'static str {
match self {
Op::Stop => "STOP",
Op::Start => "START",
Op::Restart => "RESTART",
}
}
/// Call the orchestrator for this op. Kept in one place so the spawned
/// task doesn't repeat the match four times.
async fn dispatch(self, orch: &dyn ContainerOrchestrator, app_id: &str) -> Result<()> {
match self {
Op::Stop => orch.stop(app_id).await,
Op::Start => orch.start(app_id).await,
Op::Restart => orch.restart(app_id).await,
}
}
}
impl RpcHandler {
/// Flip the package state to `op.transitional_state()`, spawn a background
/// task that runs `op.dispatch()`, and return immediately. The spawned
/// task writes the final state on completion or reverts to the
/// pre-transition state on failure.
///
/// If no package entry exists for `app_id` (e.g. Start on a container
/// that was never installed), no pre-state is recorded and the spawn
/// still runs — the post-success path will no-op the state write and
/// the next scan will pick up the newly-created entry with the correct
/// state. This keeps the helper usable for stacks that lazily create
/// their entries.
pub(super) async fn spawn_transitional(&self, op: Op, app_id: String) -> Result<()> {
let orchestrator = self
.orchestrator
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Container orchestrator not available"))?
.clone();
let state_manager = Arc::clone(&self.state_manager);
// Snapshot pre-transition state (for revert on error) and flip to
// transitional variant. Done BEFORE the spawn so the WebSocket push
// beats the RPC response — the UI should see "Stopping…" the moment
// it gets the RPC ok, not on the next scan.
let pre_state =
flip_to_transitional(&state_manager, &app_id, op.transitional_state()).await;
let log_prefix = op.log_prefix();
let app_id_log = app_id.clone();
install_log(&format!("{}: {}", log_prefix, app_id_log)).await;
tokio::spawn(async move {
match op.dispatch(orchestrator.as_ref(), &app_id).await {
Ok(()) => {
info!("{} complete: {}", log_prefix, app_id);
set_state(&state_manager, &app_id, op.final_state_on_success()).await;
}
Err(e) => {
error!("{} failed for {}: {:#}", log_prefix, app_id, e);
install_log(&format!("{} FAIL: {} - {:#}", log_prefix, app_id, e)).await;
// Revert to pre-transition state if we had one; otherwise
// leave the entry untouched so the next scan reconciles.
if let Some(prev) = pre_state {
set_state(&state_manager, &app_id, prev).await;
} else {
warn!(
"{}: no pre-transition state recorded for {}; leaving entry to next scan",
log_prefix, app_id
);
}
}
}
});
Ok(())
}
}
/// Flip the entry's state to `transitional` and return the previous state.
/// Returns `None` if there is no entry for `app_id`.
async fn flip_to_transitional(
state_manager: &StateManager,
app_id: &str,
transitional: PackageState,
) -> Option<PackageState> {
let (mut data, _) = state_manager.get_snapshot().await;
let prev = data.package_data.get(app_id).map(|e| e.state.clone());
if let Some(entry) = data.package_data.get_mut(app_id) {
entry.state = transitional;
state_manager.update_data(data).await;
}
prev
}
/// Set the entry's state to `new_state`. No-ops if the entry has since been
/// removed (e.g. uninstall ran concurrently).
async fn set_state(state_manager: &StateManager, app_id: &str, new_state: PackageState) {
let (mut data, _) = state_manager.get_snapshot().await;
if let Some(entry) = data.package_data.get_mut(app_id) {
if entry.state != new_state {
entry.state = new_state;
state_manager.update_data(data).await;
}
}
}
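
Note on the scan merge: the module docs above say the package-scan loop must preserve transitional states and refer to a `merge_preserving_transitional` helper in server.rs, which is not part of this file. One plausible shape for that rule is sketched below. The enum is a trimmed stand-in for the real `PackageState` (which has more variants, for example `Updating`), and the signature is an assumption rather than the actual helper.

```rust
/// Trimmed stand-in for the real `PackageState`; only the variants that
/// appear in the lifecycle helper above are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PackageState {
    Running,
    Stopped,
    Stopping,
    Starting,
    Restarting,
}

/// True for states that are set while an operation is still in flight.
fn is_transitional(state: PackageState) -> bool {
    matches!(
        state,
        PackageState::Stopping | PackageState::Starting | PackageState::Restarting
    )
}

/// Hypothetical merge rule: keep an in-flight transitional state, otherwise
/// accept whatever the scan observed from the container runtime.
fn merge_preserving_transitional(
    current: Option<PackageState>,
    scanned: PackageState,
) -> PackageState {
    match current {
        Some(cur) if is_transitional(cur) => cur,
        _ => scanned,
    }
}
```

Under this rule a container that podman still reports as running keeps showing `Stopping` until the spawned task writes `Stopped` or reverts on error.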


@@ -162,10 +162,8 @@ impl RpcHandler {
// progress bar after navigation instead of showing the fake
// creep again. An RPC poll every ~1s during download drives a
// real progress indicator that survives route changes.
let downloaded = update::DOWNLOAD_BYTES
.load(std::sync::atomic::Ordering::Relaxed);
let total = update::DOWNLOAD_TOTAL
.load(std::sync::atomic::Ordering::Relaxed);
let downloaded = update::DOWNLOAD_BYTES.load(std::sync::atomic::Ordering::Relaxed);
let total = update::DOWNLOAD_TOTAL.load(std::sync::atomic::Ordering::Relaxed);
let active = total > 0 && downloaded < total;
let completed = total > 0 && downloaded >= total;
@@ -175,8 +173,7 @@ impl RpcHandler {
// read timeout). The UI uses this to surface a Cancel button
// with explanatory copy.
let stalled = if active {
let last_at = update::DOWNLOAD_PROGRESS_AT
.load(std::sync::atomic::Ordering::Relaxed);
let last_at = update::DOWNLOAD_PROGRESS_AT.load(std::sync::atomic::Ordering::Relaxed);
if last_at > 0 {
let now = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)


@@ -18,12 +18,12 @@ use base64::Engine;
/// Convert a byte to an HSL triple biased toward readable foregrounds on
/// dark backgrounds (saturation 60-85%, lightness 52-70%).
fn hue_color(seed: u8) -> String {
let hue = (seed as u16) * 360 / 256;
let hue = (seed as u32) * 360 / 256;
format!("hsl({}, 72%, 60%)", hue)
}
fn accent_color(seed: u8) -> String {
let hue = (seed as u16) * 360 / 256;
let hue = (seed as u32) * 360 / 256;
format!("hsl({}, 80%, 68%)", hue)
}
@@ -37,7 +37,10 @@ fn encode_svg(svg: &str) -> String {
/// avatar rather than an error.
fn seed_bytes(pubkey_hex: &str) -> [u8; 8] {
let mut out = [0u8; 8];
let clean: String = pubkey_hex.chars().filter(|c| c.is_ascii_hexdigit()).collect();
let clean: String = pubkey_hex
.chars()
.filter(|c| c.is_ascii_hexdigit())
.collect();
for (i, byte) in out.iter_mut().enumerate() {
let lo = i * 2;
if clean.len() >= lo + 2 {
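
Note on the `u16` → `u32` change in `hue_color` and `accent_color`: the product `seed * 360` no longer fits in a `u16` once the seed reaches 183, because 183 × 360 = 65880 exceeds 65535. In Rust that panics in debug builds and wraps in release builds. The sketch below reproduces the arithmetic; the function names are illustrative and do not appear in the diff.

```rust
/// Map a seed byte to a hue in degrees, widening before the multiply
/// (the approach the diff switches to).
fn hue_degrees(seed: u8) -> u32 {
    (seed as u32) * 360 / 256
}

/// The same computation in u16, returning None where the product overflows.
/// Illustrative only: it makes the failing seed range visible.
fn hue_degrees_u16_checked(seed: u8) -> Option<u16> {
    (seed as u16).checked_mul(360).map(|product| product / 256)
}
```

Seeds 0 through 182 give the same hue either way; from 183 upward only the widened version produces a value.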


@@ -24,8 +24,7 @@ use crate::update::host_sudo;
const DOCTOR_SH: &str = include_str!("../../../scripts/container-doctor.sh");
const DOCTOR_SERVICE: &str =
include_str!("../../../image-recipe/configs/archipelago-doctor.service");
const DOCTOR_TIMER: &str =
include_str!("../../../image-recipe/configs/archipelago-doctor.timer");
const DOCTOR_TIMER: &str = include_str!("../../../image-recipe/configs/archipelago-doctor.timer");
const DOCTOR_SH_PATH: &str = "/home/archipelago/archy/scripts/container-doctor.sh";
const DOCTOR_SERVICE_PATH: &str = "/etc/systemd/system/archipelago-doctor.service";
@@ -110,8 +109,8 @@ async fn run() -> Result<bool> {
if let Err(e) = host_sudo(&["systemctl", "daemon-reload"]).await {
warn!("daemon-reload failed: {:#}", e);
}
if let Err(e) = host_sudo(&["systemctl", "enable", "--now", "archipelago-doctor.timer"])
.await
if let Err(e) =
host_sudo(&["systemctl", "enable", "--now", "archipelago-doctor.timer"]).await
{
warn!("enable archipelago-doctor.timer failed: {:#}", e);
} else if !timer_enabled {
@@ -188,10 +187,7 @@ async fn run_nginx() -> Result<bool> {
}
if !Path::new(NGINX_CONF_PATH).exists() {
debug!(
"{} missing — skipping nginx bootstrap",
NGINX_CONF_PATH
);
debug!("{} missing — skipping nginx bootstrap", NGINX_CONF_PATH);
return Ok(false);
}


@@ -0,0 +1,298 @@
//! bitcoin-ui nginx.conf renderer.
//!
//! Step 7 of the rust-orchestrator migration. Replaces the old
//! `sed -i __BITCOIN_RPC_AUTH__` approach from `first-boot-containers.sh`
//! (which destructively overwrote its own template, broke on rotation,
//! and had no story for dual Knots/Core UIs) with a binary-embedded
//! template rendered at install/reconcile time and atomic-written to
//! disk.
//!
//! The manifest bind-mounts the rendered file read-only into the
//! container at `/etc/nginx/conf.d/default.conf`. On every reconcile
//! pass we re-render and compare — if the rendered bytes would differ
//! from what's on disk (password rotated, template changed via OTA),
//! we rewrite atomically and the reconciler restarts the container.
//!
//! Source of truth:
//! * RPC user: hardcoded `archipelago` (matches the image's `bitcoin.conf`).
//! * RPC password: `/var/lib/archipelago/secrets/bitcoin-rpc-password`,
//! plaintext, written by the seed-derived credential setup.
//!
//! Both Knots and Core back-ends expose RPC on 127.0.0.1:8332 with the
//! same auth shape, so one template serves both.
use anyhow::{Context, Result};
use base64::Engine;
use sha2::{Digest, Sha256};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};
use tokio::fs;
/// The nginx.conf template. Embedded at compile time so it can never
/// drift from the code that renders it, and ships atomically with OTA.
///
/// `{{BITCOIN_RPC_AUTH}}` is the only placeholder — replaced with a
/// `base64(user:password)` blob at render time.
pub(crate) const TEMPLATE: &str = include_str!("bitcoin_ui_nginx.conf.template");
/// The single placeholder in `TEMPLATE`.
const PLACEHOLDER: &str = "{{BITCOIN_RPC_AUTH}}";
/// Hardcoded RPC user. Matches the user written into `bitcoin.conf` by
/// the bitcoin-core/bitcoin-knots bootstrap, and the legacy
/// `BITCOIN_RPC_USER="archipelago"` from `first-boot-containers.sh`.
const RPC_USER: &str = "archipelago";
/// Default path to the plaintext RPC password secret.
///
/// Written by the seed-derived credential flow; same file the bash
/// scripts read today at `first-boot-containers.sh:277` and `:1225`.
pub const DEFAULT_SECRET_PATH: &str = "/var/lib/archipelago/secrets/bitcoin-rpc-password";
/// Default output path for the rendered nginx.conf.
///
/// The manifest bind-mounts this file read-only into the bitcoin-ui
/// container at `/etc/nginx/conf.d/default.conf`.
pub const DEFAULT_RENDERED_PATH: &str = "/var/lib/archipelago/bitcoin-ui/nginx.conf";
/// Parameters for rendering. Injectable so tests can hit a tmpdir
/// instead of `/var/lib/archipelago`.
#[derive(Debug, Clone)]
pub struct RenderPaths {
/// Path to read the plaintext RPC password from.
pub secret_path: PathBuf,
/// Path to write the rendered nginx.conf to.
pub rendered_path: PathBuf,
}
impl Default for RenderPaths {
fn default() -> Self {
Self {
secret_path: PathBuf::from(DEFAULT_SECRET_PATH),
rendered_path: PathBuf::from(DEFAULT_RENDERED_PATH),
}
}
}
/// Outcome of a render pass. `Written` if the rendered bytes differed
/// from the current on-disk contents and we rewrote; `Unchanged` if
/// they matched and we left the file alone.
///
/// The caller (reconciler / install path) decides whether to restart
/// the bitcoin-ui container based on this.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RenderOutcome {
Written,
Unchanged,
}
/// Render the bitcoin-ui nginx.conf and atomic-write it to disk if it
/// differs from what's already there.
///
/// Idempotent: safe to call on every reconcile pass. Does a byte
/// comparison before writing so an unchanged password + template is a
/// no-op (no inode churn, no container restart cascade).
///
/// Errors if the secret file is missing or empty. Upstream callers
/// treat that as "bitcoin-ui isn't installable yet" rather than fatal
/// — the RPC password comes into being during bitcoin-core's own
/// bootstrap, which may not have happened yet on a fresh node.
pub async fn render(paths: &RenderPaths) -> Result<RenderOutcome> {
let password = read_password(&paths.secret_path).await?;
let auth_b64 = encode_basic_auth(RPC_USER, &password);
let rendered = TEMPLATE.replace(PLACEHOLDER, &auth_b64);
// Compare against existing. read-to-string fails on ENOENT (first
// install) — treat as "different".
let existing = fs::read_to_string(&paths.rendered_path).await.ok();
if existing.as_deref() == Some(rendered.as_str()) {
return Ok(RenderOutcome::Unchanged);
}
// Atomic write: write to sibling tmp + rename. Keeps the bind-
// mounted file pointing at a fully-formed config at all times.
let parent = paths
.rendered_path
.parent()
.ok_or_else(|| anyhow::anyhow!("rendered_path has no parent directory"))?;
fs::create_dir_all(parent)
.await
.with_context(|| format!("creating {}", parent.display()))?;
let tmp = unique_tmp_path(&paths.rendered_path);
fs::write(&tmp, &rendered)
.await
.with_context(|| format!("writing tmp {}", tmp.display()))?;
fs::rename(&tmp, &paths.rendered_path)
.await
.with_context(|| {
format!(
"renaming {} -> {}",
tmp.display(),
paths.rendered_path.display()
)
})?;
tracing::info!(
path = %paths.rendered_path.display(),
auth_hash = %short_hash(&auth_b64),
"bitcoin-ui nginx.conf rendered"
);
Ok(RenderOutcome::Written)
}
fn unique_tmp_path(dest: &Path) -> PathBuf {
static COUNTER: AtomicU64 = AtomicU64::new(0);
let n = COUNTER.fetch_add(1, Ordering::Relaxed);
let ts = SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|d| d.as_nanos())
.unwrap_or(0);
dest.with_extension(format!("tmp.{ts}.{n}"))
}
/// Read the plaintext RPC password from disk. Trims leading and trailing
/// whitespace (a trailing newline is common from `echo "$PASS" > file`)
/// but rejects an empty result.
async fn read_password(path: &Path) -> Result<String> {
let raw = fs::read_to_string(path)
.await
.with_context(|| format!("reading bitcoin RPC password from {}", path.display()))?;
let trimmed = raw.trim().to_string();
if trimmed.is_empty() {
anyhow::bail!(
"bitcoin RPC password file {} is empty — bitcoin-core bootstrap hasn't written it yet",
path.display()
);
}
Ok(trimmed)
}
/// `base64("user:password")` — the value nginx puts after `Basic ` in
/// the upstream `Authorization` header.
fn encode_basic_auth(user: &str, password: &str) -> String {
let raw = format!("{user}:{password}");
base64::engine::general_purpose::STANDARD.encode(raw.as_bytes())
}
/// Short hash of the auth value for logging — we never want the
/// plaintext or full base64 in logs (it's a credential), but a stable
/// fingerprint helps correlate rotations.
fn short_hash(s: &str) -> String {
let mut hasher = Sha256::new();
hasher.update(s.as_bytes());
let digest = hasher.finalize();
hex::encode(&digest[..4])
}
#[cfg(test)]
mod tests {
use super::*;
use tempfile::TempDir;
fn paths_in(dir: &Path, password: &str) -> RenderPaths {
let secret = dir.join("bitcoin-rpc-password");
std::fs::write(&secret, password).unwrap();
RenderPaths {
secret_path: secret,
rendered_path: dir.join("nginx.conf"),
}
}
#[tokio::test]
async fn render_writes_file_with_substitution() {
let tmp = TempDir::new().unwrap();
let paths = paths_in(tmp.path(), "hunter2");
let outcome = render(&paths).await.unwrap();
assert_eq!(outcome, RenderOutcome::Written);
let contents = std::fs::read_to_string(&paths.rendered_path).unwrap();
// archipelago:hunter2 -> "YXJjaGlwZWxhZ286aHVudGVyMg=="
assert!(
contents.contains("YXJjaGlwZWxhZ286aHVudGVyMg=="),
"base64 auth not found in rendered config:\n{contents}"
);
assert!(
!contents.contains(PLACEHOLDER),
"placeholder left in output"
);
}
#[tokio::test]
async fn render_is_idempotent_when_password_unchanged() {
let tmp = TempDir::new().unwrap();
let paths = paths_in(tmp.path(), "hunter2");
let first = render(&paths).await.unwrap();
assert_eq!(first, RenderOutcome::Written);
let second = render(&paths).await.unwrap();
assert_eq!(second, RenderOutcome::Unchanged);
}
#[tokio::test]
async fn render_rewrites_on_password_rotation() {
let tmp = TempDir::new().unwrap();
let paths = paths_in(tmp.path(), "old-pass");
render(&paths).await.unwrap();
// Rotate.
std::fs::write(&paths.secret_path, "new-pass").unwrap();
let outcome = render(&paths).await.unwrap();
assert_eq!(outcome, RenderOutcome::Written);
let contents = std::fs::read_to_string(&paths.rendered_path).unwrap();
// archipelago:new-pass -> "YXJjaGlwZWxhZ286bmV3LXBhc3M="
assert!(contents.contains("YXJjaGlwZWxhZ286bmV3LXBhc3M="));
}
#[tokio::test]
async fn render_trims_trailing_newline_from_secret() {
// Matches `echo "$PASS" > file` behaviour.
let tmp = TempDir::new().unwrap();
let paths = paths_in(tmp.path(), "hunter2\n");
render(&paths).await.unwrap();
let contents = std::fs::read_to_string(&paths.rendered_path).unwrap();
assert!(
contents.contains("YXJjaGlwZWxhZ286aHVudGVyMg=="),
"trailing newline should be stripped before encoding"
);
}
#[tokio::test]
async fn render_errors_on_empty_password() {
let tmp = TempDir::new().unwrap();
let paths = paths_in(tmp.path(), "");
let err = render(&paths).await.unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("empty"), "unexpected error: {msg}");
}
#[tokio::test]
async fn render_errors_when_secret_missing() {
let tmp = TempDir::new().unwrap();
let paths = RenderPaths {
secret_path: tmp.path().join("does-not-exist"),
rendered_path: tmp.path().join("nginx.conf"),
};
let err = render(&paths).await.unwrap_err();
let msg = format!("{err}");
assert!(
msg.contains("reading bitcoin RPC password"),
"unexpected error: {msg}"
);
}
#[test]
fn template_contains_exactly_one_placeholder() {
// Safety net: if someone adds a second placeholder to the
// template without updating the renderer, we want a test to
// fail loudly rather than ship a half-substituted config.
let count = TEMPLATE.matches(PLACEHOLDER).count();
assert_eq!(count, 1, "template must contain exactly one {PLACEHOLDER}");
}
#[test]
fn template_proxies_bitcoin_rpc_on_8332() {
// Lock in the core shape so a bad template edit doesn't ship.
assert!(TEMPLATE.contains("proxy_pass http://127.0.0.1:8332/"));
assert!(TEMPLATE.contains("location /bitcoin-rpc/"));
assert!(TEMPLATE.contains("listen 8334"));
}
}
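
Note on the write path in `render`: it compares the rendered bytes with the file on disk, skips the write when they match, and otherwise writes a sibling temp file and renames it over the destination. The same pattern is sketched below with the synchronous standard library so it can be tried without tokio. `write_if_changed` is a hypothetical name, and the fixed `.tmp` suffix is a simplification of the timestamp-plus-counter name that `unique_tmp_path` builds.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `contents` to `dest` only if the bytes differ from what is on disk.
/// Returns Ok(true) when the file was (re)written, Ok(false) when unchanged.
fn write_if_changed(dest: &Path, contents: &str) -> io::Result<bool> {
    // A missing or unreadable file counts as "different".
    if fs::read_to_string(dest).ok().as_deref() == Some(contents) {
        return Ok(false);
    }
    if let Some(parent) = dest.parent() {
        fs::create_dir_all(parent)?;
    }
    // A sibling temp file keeps the rename on the same filesystem, so readers
    // see either the old file or the complete new one.
    let tmp = dest.with_extension("tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, dest)?;
    Ok(true)
}
```

A caller can map the boolean onto the `Written` / `Unchanged` distinction and restart the container only in the first case.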


@@ -9,7 +9,7 @@ server {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Authorization "Basic __BITCOIN_RPC_AUTH__";
proxy_set_header Authorization "Basic {{BITCOIN_RPC_AUTH}}";
add_header Access-Control-Allow-Origin *;
add_header Access-Control-Allow-Methods "POST, GET, OPTIONS";
add_header Access-Control-Allow-Headers "Content-Type, Authorization";

View File

@@ -0,0 +1,331 @@
//! BootReconciler — the long-running task that keeps the prod orchestrator's
//! desired-state view in lockstep with what podman actually has.
//!
//! Step 5 of the rust-orchestrator migration. Spawned once from `main.rs`
//! (Step 6) after the initial `adopt_existing()` pass. Every `interval` it
//! calls `ProdContainerOrchestrator::reconcile_all()`, which ensures every
//! loaded manifest has a running container, installing fresh ones as needed.
//!
//! Per answered design Q3, `interval` defaults to 30 seconds.
//!
//! Shutdown is signalled via `Arc<Notify>`. The reconciler finishes its
//! current `reconcile_all` call before exiting — we don't interrupt an
//! in-flight pull or build.
//!
//! See `docs/rust-orchestrator-migration.md` §269-352.
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Notify;
use tokio::time::{self, Instant};
use crate::container::prod_orchestrator::{ProdContainerOrchestrator, ReconcileReport};
/// Default reconciler cadence (answered design Q3).
pub const DEFAULT_INTERVAL: Duration = Duration::from_secs(30);
pub struct BootReconciler {
orchestrator: Arc<ProdContainerOrchestrator>,
interval: Duration,
shutdown: Arc<Notify>,
}
impl BootReconciler {
pub fn new(
orchestrator: Arc<ProdContainerOrchestrator>,
interval: Duration,
shutdown: Arc<Notify>,
) -> Self {
Self {
orchestrator,
interval,
shutdown,
}
}
/// Run the reconcile loop until `shutdown` is notified.
///
/// Does one reconcile immediately, then sleeps `interval` between
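/// subsequent passes. A `shutdown.notify_one()` call unblocks the sleep
/// and the task returns without starting another pass. If a pass is in
/// flight when the notification arrives, that pass finishes first and the
/// loop exits at the next `select!`.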
///
/// Never panics: per-app failures are absorbed into `ReconcileReport` by
/// the orchestrator, and `reconcile_all` itself returns infallibly.
pub async fn run_forever(self) {
// Initial pass: no delay.
let report = self.orchestrator.reconcile_all().await;
Self::log_report(&report);
loop {
let deadline = Instant::now() + self.interval;
tokio::select! {
_ = time::sleep_until(deadline) => {
let report = self.orchestrator.reconcile_all().await;
Self::log_report(&report);
}
_ = self.shutdown.notified() => {
tracing::info!("boot reconciler: shutdown requested, exiting loop");
break;
}
}
}
}
fn log_report(report: &ReconcileReport) {
for (app_id, action) in &report.actions {
tracing::debug!(app_id = %app_id, action = ?action, "reconcile action");
}
for (app_id, err) in &report.failures {
tracing::warn!(app_id = %app_id, error = %err, "reconcile failure");
}
if report.failures.is_empty() {
tracing::debug!(count = report.actions.len(), "reconcile pass complete");
} else {
tracing::warn!(
ok = report.actions.len(),
failed = report.failures.len(),
"reconcile pass completed with failures"
);
}
}
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
#[cfg(test)]
mod tests {
use super::*;
use crate::container::prod_orchestrator::ProdContainerOrchestrator;
use anyhow::Result;
use archipelago_container::{
AppManifest, BuildConfig, ContainerRuntime as ContainerRuntimeTrait, ContainerState,
ContainerStatus,
};
use async_trait::async_trait;
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Mutex as StdMutex;
/// Instrumented runtime that counts reconcile-loop side effects so tests
/// can tell exactly how many passes have fired. All containers are
/// reported as already Running so `reconcile_all` will NoOp — we are only
/// measuring loop cadence, not install behavior.
#[derive(Default)]
struct CountingRuntime {
/// Number of times get_container_status has been called. Each
/// reconcile_all pass hits this once per manifest, so with one
/// manifest this equals the number of reconcile passes.
status_calls: StdMutex<u32>,
running: StdMutex<HashMap<String, ContainerState>>,
}
impl CountingRuntime {
fn new_with(names: &[&str]) -> Self {
let me = Self::default();
let mut m = me.running.lock().unwrap();
for n in names {
m.insert((*n).to_string(), ContainerState::Running);
}
drop(m);
me
}
fn status_call_count(&self) -> u32 {
*self.status_calls.lock().unwrap()
}
}
#[async_trait]
impl ContainerRuntimeTrait for CountingRuntime {
async fn pull_image(&self, _: &str, _: Option<&str>) -> Result<()> {
Ok(())
}
async fn create_container(&self, _: &AppManifest, name: &str, _: u16) -> Result<String> {
Ok(name.to_string())
}
async fn start_container(&self, _: &str) -> Result<()> {
Ok(())
}
async fn stop_container(&self, _: &str) -> Result<()> {
Ok(())
}
async fn remove_container(&self, _: &str) -> Result<()> {
Ok(())
}
async fn get_container_status(&self, name: &str) -> Result<ContainerStatus> {
*self.status_calls.lock().unwrap() += 1;
let state = self
.running
.lock()
.unwrap()
.get(name)
.cloned()
.ok_or_else(|| anyhow::anyhow!("not found: {name}"))?;
Ok(ContainerStatus {
id: format!("id-{name}"),
name: name.to_string(),
state,
health: None,
exit_code: None,
started_at: None,
image: "test".into(),
created: "now".into(),
ports: vec![],
lan_address: None,
})
}
async fn get_container_logs(&self, _: &str, _: u32) -> Result<Vec<String>> {
Ok(vec![])
}
async fn list_containers(&self) -> Result<Vec<ContainerStatus>> {
Ok(vec![])
}
async fn image_exists(&self, _: &str) -> Result<bool> {
Ok(true)
}
async fn build_image(&self, _: &BuildConfig) -> Result<()> {
Ok(())
}
}
fn pull_manifest(id: &str, image: &str) -> AppManifest {
let yaml = format!(
"app:\n id: {id}\n name: {id}\n version: 1.0.0\n container:\n image: {image}\n"
);
AppManifest::parse(&yaml).unwrap()
}
async fn orch_with_one_running_manifest(
rt: Arc<CountingRuntime>,
) -> Arc<ProdContainerOrchestrator> {
let orch = Arc::new(ProdContainerOrchestrator::with_runtime(
rt,
PathBuf::from("/nonexistent-for-tests"),
));
orch.insert_manifest_for_test(
pull_manifest("bitcoin-knots", "docker.io/bitcoin/knots:28"),
PathBuf::from("/tmp/bk"),
)
.await;
orch
}
#[tokio::test(start_paused = true)]
async fn initial_pass_fires_immediately() {
let rt = Arc::new(CountingRuntime::new_with(&["bitcoin-knots"]));
let orch = orch_with_one_running_manifest(rt.clone()).await;
let shutdown = Arc::new(Notify::new());
let reconciler =
BootReconciler::new(orch.clone(), Duration::from_secs(30), shutdown.clone());
let handle = tokio::spawn(reconciler.run_forever());
// Yield so the spawned task gets CPU to run its initial reconcile.
tokio::task::yield_now().await;
tokio::task::yield_now().await;
// We expect exactly one reconcile pass to have run by now (the initial),
// NOT a second one (the 30s sleep hasn't elapsed in paused time).
assert_eq!(rt.status_call_count(), 1, "initial pass should fire once");
shutdown.notify_one();
// Under the paused clock the select! is parked on sleep_until; the notify
// will unblock it. Yield once so the spawned task gets to poll the notify.
tokio::task::yield_now().await;
let _ = tokio::time::timeout(Duration::from_secs(1), handle).await;
}
#[tokio::test(start_paused = true)]
async fn second_pass_fires_after_interval() {
let rt = Arc::new(CountingRuntime::new_with(&["bitcoin-knots"]));
let orch = orch_with_one_running_manifest(rt.clone()).await;
let shutdown = Arc::new(Notify::new());
let reconciler =
BootReconciler::new(orch.clone(), Duration::from_secs(30), shutdown.clone());
let handle = tokio::spawn(reconciler.run_forever());
tokio::task::yield_now().await;
tokio::task::yield_now().await;
assert_eq!(rt.status_call_count(), 1);
// Fast-forward past one interval; the sleep_until should fire.
tokio::time::advance(Duration::from_secs(31)).await;
tokio::task::yield_now().await;
tokio::task::yield_now().await;
assert_eq!(
rt.status_call_count(),
2,
"a second reconcile pass should fire after one interval"
);
shutdown.notify_one();
tokio::task::yield_now().await;
let _ = tokio::time::timeout(Duration::from_secs(1), handle).await;
}
#[tokio::test(start_paused = true)]
async fn shutdown_terminates_loop() {
let rt = Arc::new(CountingRuntime::new_with(&["bitcoin-knots"]));
let orch = orch_with_one_running_manifest(rt.clone()).await;
let shutdown = Arc::new(Notify::new());
let reconciler =
BootReconciler::new(orch.clone(), Duration::from_secs(30), shutdown.clone());
let handle = tokio::spawn(reconciler.run_forever());
tokio::task::yield_now().await;
tokio::task::yield_now().await;
shutdown.notify_one();
// The select! should wake on Notified and return. Nudge the paused clock
// forward and bound the join with a timeout so a hung task fails the
// test instead of blocking it.
tokio::time::advance(Duration::from_millis(10)).await;
let result = tokio::time::timeout(Duration::from_secs(5), handle).await;
assert!(result.is_ok(), "reconciler did not exit after shutdown");
}
#[tokio::test(start_paused = true)]
async fn failure_in_one_pass_does_not_stop_loop() {
// The runtime starts with no containers, so the first status lookup for
// the manifest fails with "not found". CountingRuntime::pull_image and
// create_container both return Ok, but nothing is ever added to the
// `running` map, so every later status lookup misses as well. In practice
// reconcile_all will observe the missing container and install_fresh
// will run. We only care that the loop keeps ticking whatever the
// report contains.
let rt = Arc::new(CountingRuntime::default());
let orch = Arc::new(ProdContainerOrchestrator::with_runtime(
rt.clone(),
PathBuf::from("/nonexistent-for-tests"),
));
orch.insert_manifest_for_test(
pull_manifest("bitcoin-knots", "docker.io/bitcoin/knots:28"),
PathBuf::from("/tmp/bk"),
)
.await;
let shutdown = Arc::new(Notify::new());
let reconciler =
BootReconciler::new(orch.clone(), Duration::from_secs(30), shutdown.clone());
let handle = tokio::spawn(reconciler.run_forever());
tokio::task::yield_now().await;
tokio::task::yield_now().await;
let first = rt.status_call_count();
assert!(first >= 1, "initial pass should have touched the runtime");
// Advance one interval — second pass should fire regardless of what
// the first pass did.
tokio::time::advance(Duration::from_secs(31)).await;
tokio::task::yield_now().await;
tokio::task::yield_now().await;
let second = rt.status_call_count();
assert!(
second > first,
"loop should have fired a second pass after the interval"
);
shutdown.notify_one();
tokio::time::advance(Duration::from_millis(10)).await;
let _ = tokio::time::timeout(Duration::from_secs(5), handle).await;
}
}
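
Reviewer sketch, not part of the diff: the shutdown contract of `run_forever` (one pass immediately, then a sleep that a shutdown signal can cut short) can be illustrated without tokio. The version below swaps in a std thread and an `mpsc` channel for `tokio::select!` and `Notify`; `run_loop` and `passes_before_shutdown` are hypothetical names. A message queued on the channel plays the role of the permit stored by `notify_one()`, so a signal sent during a pass is not lost.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::mpsc::{self, RecvTimeoutError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Run one "reconcile pass" immediately, then one per `interval`, until a
/// shutdown message arrives or the sender is dropped.
fn run_loop(interval: Duration, shutdown: mpsc::Receiver<()>, passes: Arc<AtomicU32>) {
    // Initial pass: no delay.
    passes.fetch_add(1, Ordering::SeqCst);
    loop {
        match shutdown.recv_timeout(interval) {
            // The interval elapsed with no signal: run the next pass.
            Err(RecvTimeoutError::Timeout) => {
                passes.fetch_add(1, Ordering::SeqCst);
            }
            // Signal received (or sender gone): exit without another pass.
            Ok(()) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
}

/// Spawn the loop, request shutdown right away, and report how many passes
/// ran before the worker exited.
fn passes_before_shutdown(interval: Duration) -> u32 {
    let (tx, rx) = mpsc::channel();
    let passes = Arc::new(AtomicU32::new(0));
    let worker = {
        let passes = Arc::clone(&passes);
        thread::spawn(move || run_loop(interval, rx, passes))
    };
    // Even if this is sent before the worker starts, the message is queued
    // and observed after the initial pass.
    tx.send(()).unwrap();
    worker.join().unwrap();
    passes.load(Ordering::SeqCst)
}
```

With a long interval, `passes_before_shutdown` only counts the initial pass, which is the same property `shutdown_terminates_loop` checks for the async implementation.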

View File

@@ -1,12 +1,15 @@
use anyhow::{Context, Result};
use archipelago_container::{
AppManifest, BitcoinSimulationMode, BitcoinSimulator,
ContainerRuntime as ContainerRuntimeTrait, ContainerStatus, PortManager,
ContainerRuntime as ContainerRuntimeTrait, ContainerStatus, PortManager, ResolvedSource,
};
use async_trait::async_trait;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use crate::config::{BitcoinSimulation, Config, ContainerRuntime};
use crate::container::data_manager::DevDataManager;
use crate::container::traits::ContainerOrchestrator;
pub struct DevContainerOrchestrator {
runtime: Arc<dyn ContainerRuntimeTrait>,
@@ -103,14 +106,27 @@ impl DevContainerOrchestrator {
volume.source = dev_path.to_string_lossy().to_string();
}
// Pull image
self.runtime
.pull_image(
&manifest.app.container.image,
manifest.app.container.image_signature.as_deref(),
)
.await
.context("Failed to pull image")?;
// Resolve pull-or-build. Dev orchestrator currently only supports pull;
// Build support lands in Step 2 of the rust-orchestrator migration.
match manifest.app.container.resolve().ok_or_else(|| {
anyhow::anyhow!("manifest container config invalid (neither image nor build)")
})? {
ResolvedSource::Pull {
image,
image_signature,
..
} => {
self.runtime
.pull_image(&image, image_signature.as_deref())
.await
.context("Failed to pull image")?;
}
ResolvedSource::Build(_) => {
anyhow::bail!(
"dev orchestrator does not yet support local image builds (see rust-orchestrator-migration.md Step 2)"
);
}
}
// Create container with port offset
let port_offset = if self.config.dev_mode {
@@ -242,4 +258,153 @@ impl DevContainerOrchestrator {
archipelago_container::ContainerState::Unknown(_) => Ok("unknown".to_string()),
}
}
/// Load a manifest for `app_id` from the dev-mode apps directory.
///
/// Used by the trait-level `install(app_id)` entry point.
///
/// Search order intentionally mirrors production/operator reality:
/// 1) `$ARCHIPELAGO_APPS_DIR` (explicit override)
/// 2) `/opt/archipelago/apps` (image-recipe canonical path)
/// 3) `/home/archipelago/Projects/archy/apps` (repo-local fallback on dev nodes)
/// 4) `<data_dir>/apps` (legacy dev layout)
async fn load_manifest_for(&self, app_id: &str) -> Result<AppManifest> {
let candidates = candidate_manifest_paths(app_id, &self.config.data_dir);
let mut last_err: Option<anyhow::Error> = None;
for path in candidates {
let content = match tokio::fs::read_to_string(&path).await {
Ok(c) => c,
Err(e) => {
last_err = Some(e.into());
continue;
}
};
let manifest: AppManifest = serde_yaml::from_str(&content)
.with_context(|| format!("parsing manifest {}", path.display()))?;
return Ok(manifest);
}
let msg = format!(
"manifest for {} not found in any search path (set ARCHIPELAGO_APPS_DIR or install /opt/archipelago/apps)",
app_id
);
Err(match last_err {
Some(e) => e.context(msg),
None => anyhow::anyhow!(msg),
})
}
}
fn candidate_manifest_paths(app_id: &str, data_dir: &Path) -> Vec<PathBuf> {
let mut roots: Vec<PathBuf> = Vec::new();
if let Ok(v) = std::env::var("ARCHIPELAGO_APPS_DIR") {
let v = v.trim();
if !v.is_empty() {
roots.push(PathBuf::from(v));
}
}
roots.push(PathBuf::from("/opt/archipelago/apps"));
roots.push(PathBuf::from("/home/archipelago/Projects/archy/apps"));
roots.push(data_dir.join("apps"));
let mut deduped: Vec<PathBuf> = Vec::new();
for root in roots {
if !deduped.iter().any(|p| p == &root) {
deduped.push(root);
}
}
deduped
.into_iter()
.map(|root| root.join(app_id).join("manifest.yml"))
.collect()
}
// ---------------------------------------------------------------------------
// Trait impl (Step 4): expose the shared ContainerOrchestrator surface.
// Forwards to the inherent methods, which internally apply the `-dev` suffix
// and the port offset. The trait keeps the RPC layer mode-agnostic; Dev's
// install_container (manifest_path-based) stays as an inherent method for the
// ad-hoc dev-mode RPC and is not exposed on the trait.
// ---------------------------------------------------------------------------
#[async_trait]
impl ContainerOrchestrator for DevContainerOrchestrator {
async fn install(&self, app_id: &str) -> Result<String> {
let manifest = self.load_manifest_for(app_id).await?;
let name = self.install_container(&manifest, "").await?;
Ok(name)
}
async fn start(&self, app_id: &str) -> Result<()> {
self.start_container(app_id).await
}
async fn stop(&self, app_id: &str) -> Result<()> {
self.stop_container(app_id).await
}
async fn restart(&self, app_id: &str) -> Result<()> {
let _ = self.stop_container(app_id).await;
self.start_container(app_id).await
}
async fn remove(&self, app_id: &str, preserve_data: bool) -> Result<()> {
self.remove_container(app_id, preserve_data).await
}
async fn upgrade(&self, app_id: &str) -> Result<()> {
// Dev upgrade: stop, remove (preserving data), re-install from the loaded manifest.
let _ = self.stop_container(app_id).await;
let _ = self.remove_container(app_id, true).await;
let manifest = self.load_manifest_for(app_id).await?;
self.install_container(&manifest, "").await?;
self.start_container(app_id).await
}
async fn status(&self, app_id: &str) -> Result<ContainerStatus> {
self.get_container_status(app_id).await
}
async fn list(&self) -> Result<Vec<ContainerStatus>> {
self.list_containers().await
}
async fn logs(&self, app_id: &str, lines: u32) -> Result<Vec<String>> {
self.get_container_logs(app_id, lines).await
}
async fn health(&self, app_id: &str) -> Result<String> {
self.get_health_status(app_id).await
}
}
#[cfg(test)]
mod tests {
use super::candidate_manifest_paths;
use std::path::PathBuf;
#[test]
fn candidate_manifest_paths_include_expected_fallbacks() {
let app_id = "bitcoin-ui";
let paths = candidate_manifest_paths(app_id, &PathBuf::from("/var/lib/archipelago"));
let as_strings: Vec<String> = paths
.iter()
.map(|p| p.to_string_lossy().into_owned())
.collect();
assert!(as_strings
.iter()
.any(|p| p == "/opt/archipelago/apps/bitcoin-ui/manifest.yml"));
assert!(as_strings
.iter()
.any(|p| p == "/home/archipelago/Projects/archy/apps/bitcoin-ui/manifest.yml"));
assert!(as_strings
.iter()
.any(|p| p == "/var/lib/archipelago/apps/bitcoin-ui/manifest.yml"));
}
}
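
Reviewer sketch, not part of the diff: `candidate_manifest_paths` reads `ARCHIPELAGO_APPS_DIR` directly, so the existing test cannot cover the override or the de-duplication branch without mutating process environment. One possible shape, assuming a hypothetical `candidate_paths` that takes the override as a parameter, is shown below; the search roots are copied from the function above.

```rust
use std::path::{Path, PathBuf};

/// Same search order as `candidate_manifest_paths`, but the override is an
/// argument (hypothetical signature) instead of an environment lookup.
fn candidate_paths(app_id: &str, data_dir: &Path, apps_dir_override: Option<&str>) -> Vec<PathBuf> {
    let mut roots: Vec<PathBuf> = Vec::new();
    // 1) explicit override, ignored when blank
    if let Some(v) = apps_dir_override.map(str::trim).filter(|v| !v.is_empty()) {
        roots.push(PathBuf::from(v));
    }
    // 2) canonical image-recipe path, 3) repo-local fallback, 4) legacy layout
    roots.push(PathBuf::from("/opt/archipelago/apps"));
    roots.push(PathBuf::from("/home/archipelago/Projects/archy/apps"));
    roots.push(data_dir.join("apps"));

    // Drop duplicates while keeping the first occurrence, so priority order
    // is preserved when the override equals one of the built-in roots.
    let mut deduped: Vec<PathBuf> = Vec::new();
    for root in roots {
        if !deduped.contains(&root) {
            deduped.push(root);
        }
    }
    deduped
        .into_iter()
        .map(|root| root.join(app_id).join("manifest.yml"))
        .collect()
}
```

The production function could then become a thin wrapper that passes `std::env::var("ARCHIPELAGO_APPS_DIR").ok().as_deref()`; whether that refactor is worth it is a judgment call for the author.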

View File

@@ -0,0 +1,118 @@
//! filebrowser config bootstrap helper.
//!
//! Mirrors the legacy first-boot behavior that writes
//! `/var/lib/archipelago/filebrowser-data/.filebrowser.json` before
//! starting the container with `--config /data/.filebrowser.json`.
use anyhow::{Context, Result};
use std::path::PathBuf;
use tokio::fs;
pub const DEFAULT_SRV_ROOT: &str = "/var/lib/archipelago/filebrowser";
pub const DEFAULT_DATA_DIR: &str = "/var/lib/archipelago/filebrowser-data";
pub const DEFAULT_CONFIG_PATH: &str = "/var/lib/archipelago/filebrowser-data/.filebrowser.json";
const DEFAULT_CONFIG_JSON: &str =
"{\"port\":80,\"baseURL\":\"\",\"address\":\"0.0.0.0\",\"database\":\"/data/filebrowser.db\",\"root\":\"/srv\",\"log\":\"stdout\"}\n";
#[derive(Debug, Clone)]
pub struct EnsurePaths {
pub srv_root: PathBuf,
pub data_dir: PathBuf,
pub config_path: PathBuf,
}
impl Default for EnsurePaths {
fn default() -> Self {
Self {
srv_root: PathBuf::from(DEFAULT_SRV_ROOT),
data_dir: PathBuf::from(DEFAULT_DATA_DIR),
config_path: PathBuf::from(DEFAULT_CONFIG_PATH),
}
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnsureOutcome {
Written,
Unchanged,
}
pub async fn ensure_config(paths: &EnsurePaths) -> Result<EnsureOutcome> {
fs::create_dir_all(&paths.srv_root)
.await
.with_context(|| format!("creating {}", paths.srv_root.display()))?;
fs::create_dir_all(&paths.data_dir)
.await
.with_context(|| format!("creating {}", paths.data_dir.display()))?;
for d in ["Documents", "Photos", "Music", "Downloads", "Builds"] {
fs::create_dir_all(paths.srv_root.join(d))
.await
.with_context(|| format!("creating {}/{}", paths.srv_root.display(), d))?;
}
if paths.config_path.exists() {
return Ok(EnsureOutcome::Unchanged);
}
let parent = paths
.config_path
.parent()
.ok_or_else(|| anyhow::anyhow!("config_path has no parent directory"))?;
fs::create_dir_all(parent)
.await
.with_context(|| format!("creating {}", parent.display()))?;
let tmp = paths.config_path.with_extension("tmp");
fs::write(&tmp, DEFAULT_CONFIG_JSON)
.await
.with_context(|| format!("writing tmp {}", tmp.display()))?;
fs::rename(&tmp, &paths.config_path)
.await
.with_context(|| {
format!(
"renaming {} -> {}",
tmp.display(),
paths.config_path.display()
)
})?;
Ok(EnsureOutcome::Written)
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn ensure_config_creates_dirs_and_file() {
let tmp = tempfile::TempDir::new().unwrap();
let paths = EnsurePaths {
srv_root: tmp.path().join("filebrowser"),
data_dir: tmp.path().join("filebrowser-data"),
config_path: tmp.path().join("filebrowser-data/.filebrowser.json"),
};
let out = ensure_config(&paths).await.unwrap();
assert_eq!(out, EnsureOutcome::Written);
assert!(paths.config_path.exists());
assert!(paths.srv_root.join("Documents").exists());
assert!(paths.srv_root.join("Photos").exists());
}
#[tokio::test]
async fn ensure_config_is_idempotent() {
let tmp = tempfile::TempDir::new().unwrap();
let paths = EnsurePaths {
srv_root: tmp.path().join("filebrowser"),
data_dir: tmp.path().join("filebrowser-data"),
config_path: tmp.path().join("filebrowser-data/.filebrowser.json"),
};
let first = ensure_config(&paths).await.unwrap();
assert_eq!(first, EnsureOutcome::Written);
let second = ensure_config(&paths).await.unwrap();
assert_eq!(second, EnsureOutcome::Unchanged);
}
}
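
Reviewer sketch, not part of the diff: the core of `ensure_config` is a write-once pattern (skip if the file exists, otherwise write to a sibling temp file and rename it into place). The blocking std-only version below isolates that pattern; `write_if_absent` and `Outcome` are hypothetical names, and the real helper uses `tokio::fs` and also creates the user-facing directories.

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Written,
    Unchanged,
}

/// Create `path` with `contents` unless it already exists. The write goes
/// to a sibling `.tmp` file first so a crash never leaves a half-written
/// config at the final path.
fn write_if_absent(path: &Path, contents: &str) -> io::Result<Outcome> {
    if path.exists() {
        // Never clobber a config the operator may have edited.
        return Ok(Outcome::Unchanged);
    }
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    // `.filebrowser.json` becomes `.filebrowser.tmp` in the same directory,
    // which keeps the rename on one filesystem.
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)?;
    Ok(Outcome::Written)
}
```

The exists-then-write sequence is not race-free if two processes bootstrap at once; for a first-boot helper that runs from a single task this is probably acceptable.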

View File

@@ -1,8 +1,8 @@
//! Parser for image-versions.sh — single source of truth for pinned container images.
//!
//! Reads the deployed file at /opt/archipelago/image-versions.sh (or the repo-local
//! scripts/image-versions.sh as fallback) and exposes lookup functions so the container
//! scanner can compare running images against pinned targets.
//! Reads the deployed file at /opt/archipelago/scripts/image-versions.sh (the canonical
//! location installed by the image-recipe) with fallbacks for older layouts and the
//! repo-local scripts/image-versions.sh for development runs from the repo root.
use std::collections::HashMap;
use std::path::Path;
@@ -18,9 +18,15 @@ struct CacheEntry {
images: HashMap<String, String>,
}
/// File search order — production path first, then repo-local for dev.
/// File search order — canonical production path first, older layout second,
/// repo-local for dev last. The canonical deployed path is
/// /opt/archipelago/scripts/image-versions.sh; earlier builds put it directly
/// in /opt/archipelago/, so that path is kept as a fallback for not-yet-updated
/// nodes. The repo-relative entry matches `cargo run` from the repo root.
const PATHS: &[&str] = &[
"/opt/archipelago/scripts/image-versions.sh",
"/opt/archipelago/image-versions.sh",
"/home/archipelago/Projects/archy/scripts/image-versions.sh",
"scripts/image-versions.sh",
];
@@ -102,8 +108,11 @@ fn parse_image_versions(content: &str) -> HashMap<String, String> {
}
}
// Keep only *_IMAGE entries
vars.retain(|k, _| k.ends_with("_IMAGE"));
// Keep only *_IMAGE entries whose value looks like a container image
// reference (contains a `:` tag separator and at least one `/` path
// component). Rejects placeholder values like "something" so a
// hand-edit typo in image-versions.sh never gets treated as an image.
vars.retain(|k, v| k.ends_with("_IMAGE") && v.contains(':') && v.contains('/'));
vars
}
@@ -134,12 +143,15 @@ fn image_var_for_app(app_id: &str) -> Option<&'static str> {
"lnd" => Some("LND_IMAGE"),
"electrumx" => Some("ELECTRUMX_IMAGE"),
"electrs" | "mempool-electrs" => Some("ELECTRUMX_IMAGE"),
"bitcoin-ui" | "archy-bitcoin-ui" => Some("BITCOIN_UI_IMAGE"),
"lnd-ui" | "archy-lnd-ui" => Some("LND_UI_IMAGE"),
"electrs-ui" | "archy-electrs-ui" => Some("ELECTRS_UI_IMAGE"),
// Mempool stack (primary = web)
"mempool" | "mempool-web" => Some("MEMPOOL_WEB_IMAGE"),
"mempool" | "mempool-web" | "archy-mempool-web" => Some("MEMPOOL_WEB_IMAGE"),
// BTCPay stack (primary = server)
"btcpay" | "btcpay-server" | "btcpayserver" => Some("BTCPAY_IMAGE"),
"btcpay" | "btcpay-server" | "btcpayserver" | "archy-btcpay-ui" => Some("BTCPAY_IMAGE"),
// Apps
"homeassistant" | "home-assistant" => Some("HOMEASSISTANT_IMAGE"),
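
Reviewer sketch, not part of the diff: the tightened `retain` in `parse_image_versions` keeps a variable only if its name ends in `_IMAGE` and its value contains both `:` and `/`. The std-only code below shows the effect on a small input. The line parser is a simplified assumption (plain `KEY="value"` lines, optional `export`); the real parser is only partly visible in this hunk and may handle more shell syntax.

```rust
use std::collections::HashMap;

/// The heuristic from the diff: a tag separator and at least one path
/// component.
fn looks_like_image_ref(value: &str) -> bool {
    value.contains(':') && value.contains('/')
}

/// Simplified, hypothetical parser for `KEY="value"` lines followed by the
/// same filter the diff applies.
fn pinned_images(content: &str) -> HashMap<String, String> {
    let mut vars: HashMap<String, String> = HashMap::new();
    for line in content.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let line = line.strip_prefix("export ").unwrap_or(line);
        if let Some((key, value)) = line.split_once('=') {
            let value = value.trim().trim_matches('"');
            vars.insert(key.trim().to_string(), value.to_string());
        }
    }
    // Keep only *_IMAGE entries whose value looks like an image reference.
    vars.retain(|k, v| k.ends_with("_IMAGE") && looks_like_image_ref(v.as_str()));
    vars
}
```

Because the check is a heuristic, a reference with a registry port but no tag (for example `host:3000/org/app`) also passes; that may be fine here, since the goal stated in the comment is only to reject obvious placeholders.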

View File

@@ -1,8 +1,19 @@
pub mod bitcoin_ui;
pub mod boot_reconciler;
pub mod data_manager;
pub mod dev_orchestrator;
pub mod docker_packages;
pub mod filebrowser;
pub mod image_versions;
pub mod prod_orchestrator;
pub mod registry;
pub mod traits;
pub use boot_reconciler::{BootReconciler, DEFAULT_INTERVAL as RECONCILER_DEFAULT_INTERVAL};
pub use dev_orchestrator::DevContainerOrchestrator;
pub use docker_packages::DockerPackageScanner;
pub use prod_orchestrator::{
compute_container_name, AdoptionReport, ProdContainerOrchestrator, ReconcileAction,
ReconcileReport,
};
pub use traits::ContainerOrchestrator;

File diff suppressed because it is too large

View File

@@ -15,7 +15,7 @@ const REGISTRY_FILE: &str = "config/registries.json";
/// A single container registry.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Registry {
/// Registry URL (e.g., "git.tx1138.com/lfg2025" or "23.182.128.160:3000/lfg2025").
/// Registry URL (e.g., "git.tx1138.com/lfg2025" or "146.59.87.168:3000/lfg2025").
pub url: String,
/// Human-readable name.
pub name: String,
@@ -44,8 +44,8 @@ impl Default for RegistryConfig {
Self {
registries: vec![
Registry {
url: "23.182.128.160:3000/lfg2025".to_string(),
name: "Server 1 (VPS)".to_string(),
url: "146.59.87.168:3000/lfg2025".to_string(),
name: "Server 1 (OVH)".to_string(),
tls_verify: false,
enabled: true,
priority: 0,
@@ -57,13 +57,6 @@ impl Default for RegistryConfig {
enabled: true,
priority: 10,
},
Registry {
url: "146.59.87.168:3000/lfg2025".to_string(),
name: "Server 3 (OVH)".to_string(),
tls_verify: false,
enabled: true,
priority: 20,
},
],
}
}
@@ -78,15 +71,14 @@ impl RegistryConfig {
}
/// Rewrite an image reference to use a specific registry.
/// E.g., "git.tx1138.com/lfg2025/bitcoin-knots:latest" with registry "23.182.128.160:3000/lfg2025"
/// becomes "23.182.128.160:3000/lfg2025/bitcoin-knots:latest".
/// E.g., "git.tx1138.com/lfg2025/bitcoin-knots:latest" with registry "146.59.87.168:3000/lfg2025"
/// becomes "146.59.87.168:3000/lfg2025/bitcoin-knots:latest".
pub fn rewrite_image(&self, image: &str, registry: &Registry) -> String {
// Extract the image name (last component after the org/namespace)
// Handles: "registry/org/image:tag" -> "image:tag"
let image_name = extract_image_name(image);
format!("{}/{}", registry.url, image_name)
}
}
/// Extract the image name from a full image reference.
@@ -114,6 +106,19 @@ pub async fn load_registries(data_dir: &Path) -> Result<RegistryConfig> {
let mut config: RegistryConfig =
serde_json::from_str(&content).unwrap_or_else(|_| RegistryConfig::default());
// One-time migration: the Hetzner VPS at 23.182.128.160 was
// decommissioned 2026-04-23. Existing nodes have it baked into
// their saved registry list (was the original Server 1). Strip it
// on load so every container pull doesn't pay a connection-refused
// timeout against a dead host. Exception to the usual "explicit
// removals stick" rule: the user never chose to add this — it
// was a default.
let before = config.registries.len();
config
.registries
.retain(|r| !r.url.contains("23.182.128.160"));
let mut changed = config.registries.len() != before;
// Migrate: any default registry URL that isn't already in the
// saved list gets appended at the end (so existing priority order
// is preserved for anything the operator already configured).
@@ -126,16 +131,15 @@ pub async fn load_registries(data_dir: &Path) -> Result<RegistryConfig> {
.map(|r| r.priority)
.max()
.unwrap_or(0);
let mut added = false;
for (i, def) in defaults.registries.iter().enumerate() {
if !known.contains(&def.url) {
let mut cloned = def.clone();
cloned.priority = max_priority.saturating_add(10 + i as u32);
config.registries.push(cloned);
added = true;
changed = true;
}
}
if added {
if changed {
// Persist so the next load doesn't have to re-merge.
let _ = save_registries(data_dir, &config).await;
}
@@ -178,12 +182,12 @@ mod tests {
#[test]
fn test_rewrite_image() {
let config = RegistryConfig::default();
// Default primary is now VPS (index 0). A tx1138-hardcoded image
// rewrites to VPS when asked for the primary mirror.
// Default primary is now the OVH VPS (index 0). A tx1138-hardcoded
// image rewrites to OVH when asked for the primary mirror.
let primary = &config.registries[0];
assert_eq!(
config.rewrite_image("git.tx1138.com/lfg2025/bitcoin-knots:latest", primary),
"23.182.128.160:3000/lfg2025/bitcoin-knots:latest"
"146.59.87.168:3000/lfg2025/bitcoin-knots:latest"
);
}
@@ -191,16 +195,15 @@ mod tests {
fn test_active_registries_sorted() {
let config = RegistryConfig::default();
let active = config.active_registries();
assert_eq!(active.len(), 3);
assert_eq!(active.len(), 2);
assert!(active[0].priority <= active[1].priority);
assert!(active[1].priority <= active[2].priority);
}
#[tokio::test]
async fn test_load_default() {
let tmp = TempDir::new().unwrap();
let config = load_registries(tmp.path()).await.unwrap();
assert_eq!(config.registries.len(), 3);
assert_eq!(config.registries.len(), 2);
}
#[tokio::test]
@@ -216,6 +219,6 @@ mod tests {
});
save_registries(tmp.path(), &config).await.unwrap();
let loaded = load_registries(tmp.path()).await.unwrap();
assert_eq!(loaded.registries.len(), 4);
assert_eq!(loaded.registries.len(), 3);
}
}
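
Reviewer sketch, not part of the diff: the load-time migration does two things in order (strip the decommissioned host, then append any default that is missing), and a single `changed` flag decides whether to persist. The std-only code below models that flow. `Registry` is cut down to two fields, `migrate` is a hypothetical name, `extract_image_name` is assumed to take the last `/`-separated component (its body is not shown in this diff), and `registry.example` is a placeholder URL.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Registry {
    url: String,
    priority: u32,
}

/// Assumed behaviour: "registry/org/image:tag" -> "image:tag".
fn extract_image_name(image: &str) -> &str {
    image.rsplit('/').next().unwrap_or(image)
}

fn rewrite_image(image: &str, registry_url: &str) -> String {
    format!("{}/{}", registry_url, extract_image_name(image))
}

/// Returns true when the saved list was modified and should be persisted.
fn migrate(saved: &mut Vec<Registry>, defaults: &[Registry], dead_host: &str) -> bool {
    // Step 1: drop entries pointing at the decommissioned host.
    let before = saved.len();
    saved.retain(|r| !r.url.contains(dead_host));
    let mut changed = saved.len() != before;

    // Step 2: append missing defaults after everything already configured.
    let max_priority = saved.iter().map(|r| r.priority).max().unwrap_or(0);
    for (i, def) in defaults.iter().enumerate() {
        if !saved.iter().any(|r| r.url == def.url) {
            saved.push(Registry {
                url: def.url.clone(),
                priority: max_priority.saturating_add(10 + i as u32),
            });
            changed = true;
        }
    }
    changed
}
```

Running `migrate` a second time on the result returns `false`, which is what makes it safe to call on every load.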

View File

@@ -0,0 +1,56 @@
//! Orchestrator trait — the shared surface the RPC layer talks to.
//!
//! Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
//! surface of `DevContainerOrchestrator` and `ProdContainerOrchestrator` so
//! `RpcHandler` can hold `Arc<dyn ContainerOrchestrator>` and stop caring
//! which mode it is in.
//!
//! The trait takes `app_id: &str` everywhere (never a manifest path). Dev and
//! Prod both resolve app_id → manifest internally. The legacy
//! `container-install { manifest_path }` RPC shape is preserved as a concrete
//! `install_container_from_path` method on `DevContainerOrchestrator` only,
//! since that ad-hoc workflow is a dev convenience and has no prod meaning.
//!
//! See `docs/rust-orchestrator-migration.md`.
use anyhow::Result;
use archipelago_container::ContainerStatus;
use async_trait::async_trait;
/// Lifecycle + query operations every orchestrator exposes to the RPC layer.
#[async_trait]
pub trait ContainerOrchestrator: Send + Sync {
/// Build-or-pull the image, create the container, and start it. Returns the
/// podman container name that was created. Assumes the app_id corresponds
/// to a manifest the orchestrator already knows about.
async fn install(&self, app_id: &str) -> Result<String>;
/// Start an already-created container.
async fn start(&self, app_id: &str) -> Result<()>;
/// Stop a running container. No-op on Prod if already stopped.
async fn stop(&self, app_id: &str) -> Result<()>;
/// Stop-then-start. Best-effort: ignores stop failure.
async fn restart(&self, app_id: &str) -> Result<()>;
/// Remove the container. `preserve_data = true` keeps the volumes; `false`
/// is honored on a best-effort basis (Dev cleans, Prod leaves the volume
/// management to the data layer).
async fn remove(&self, app_id: &str, preserve_data: bool) -> Result<()>;
/// Pull/rebuild the image and recreate the container from scratch.
async fn upgrade(&self, app_id: &str) -> Result<()>;
/// Current state of a single container.
async fn status(&self, app_id: &str) -> Result<ContainerStatus>;
/// All containers this orchestrator knows about.
async fn list(&self) -> Result<Vec<ContainerStatus>>;
/// Tail the container's stdout+stderr.
async fn logs(&self, app_id: &str, lines: u32) -> Result<Vec<String>>;
/// Coarse health summary: "healthy", "unhealthy", "starting", "paused", "unknown".
async fn health(&self, app_id: &str) -> Result<String>;
}
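
Reviewer sketch, not part of the diff: the point of the trait is that the RPC layer holds `Arc<dyn ContainerOrchestrator>` and never branches on mode. The std-only code below shows that shape with a synchronous stand-in. `Orchestrator`, `FlakyStopBackend`, and `RpcHandler::handle_restart` are hypothetical, and the default `restart` body is an assumption that mirrors the "ignores stop failure" note on the real trait, where each implementor currently writes its own.

```rust
use std::sync::Arc;

/// Synchronous stand-in for a subset of the lifecycle trait.
trait Orchestrator: Send + Sync {
    fn start(&self, app_id: &str) -> Result<(), String>;
    fn stop(&self, app_id: &str) -> Result<(), String>;

    /// Stop-then-start. Best-effort: a failed stop is ignored on purpose,
    /// so restarting an already-stopped container still works.
    fn restart(&self, app_id: &str) -> Result<(), String> {
        let _ = self.stop(app_id);
        self.start(app_id)
    }
}

/// Hypothetical backend whose `stop` always fails, as if the container
/// were not running.
struct FlakyStopBackend;

impl Orchestrator for FlakyStopBackend {
    fn start(&self, _app_id: &str) -> Result<(), String> {
        Ok(())
    }
    fn stop(&self, app_id: &str) -> Result<(), String> {
        Err(format!("{app_id} is not running"))
    }
}

/// The handler only sees the trait object, never a concrete mode.
struct RpcHandler {
    orchestrator: Arc<dyn Orchestrator>,
}

impl RpcHandler {
    fn handle_restart(&self, app_id: &str) -> String {
        match self.orchestrator.restart(app_id) {
            Ok(()) => "ok".to_string(),
            Err(e) => format!("error: {e}"),
        }
    }
}
```

Swapping `FlakyStopBackend` for another implementor changes nothing in `RpcHandler`, which is the mode-agnostic property the module docs describe.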

View File

@@ -129,9 +129,21 @@ pub fn is_revoked(vc: &VerifiableCredential) -> bool {
mod tests {
use super::*;
/// Create a tempdir with a dummy `identity/node_key` so that the
/// credential store's encrypt/decrypt path can derive a key.
/// Returns the tempdir guard (drop it to clean up).
fn test_dir_with_node_key() -> tempfile::TempDir {
let dir = tempfile::tempdir().unwrap();
let identity_dir = dir.path().join("identity");
std::fs::create_dir_all(&identity_dir).unwrap();
// 32 bytes of deterministic test material; never a real key.
std::fs::write(identity_dir.join("node_key"), [0xAB; 32]).unwrap();
dir
}
#[tokio::test]
async fn test_issue_credential_w3c_format() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
let vc = issue_credential(
dir.path(),
"did:key:issuer",
@@ -163,7 +175,7 @@ mod tests {
#[tokio::test]
async fn test_issue_credential_serializes_as_jsonld() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
let vc = issue_credential(
dir.path(),
"did:key:issuer",
@@ -185,7 +197,7 @@ mod tests {
#[tokio::test]
async fn test_save_and_load_roundtrip() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
issue_credential(
dir.path(),
"did:key:a",
@@ -205,7 +217,7 @@ mod tests {
#[tokio::test]
async fn test_issue_credential_sign_fn_failure_propagates() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
let result = issue_credential(
dir.path(),
"did:key:issuer",
@@ -257,7 +269,7 @@ mod tests {
#[tokio::test]
async fn test_revoke_credential() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
let vc = issue_credential(
dir.path(),
"did:key:issuer",
@@ -288,7 +300,7 @@ mod tests {
#[tokio::test]
async fn test_revoke_nonexistent_credential_fails() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
let result = revoke_credential(dir.path(), "urn:uuid:does-not-exist").await;
assert!(result.is_err());
assert!(result
@@ -299,7 +311,7 @@ mod tests {
#[tokio::test]
async fn test_list_credentials_no_filter() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
issue_credential(
dir.path(),
"did:key:a",
@@ -329,7 +341,7 @@ mod tests {
#[tokio::test]
async fn test_list_credentials_filter_by_did() {
let dir = tempfile::tempdir().unwrap();
let dir = test_dir_with_node_key();
issue_credential(
dir.path(),
"did:key:alice",


@@ -143,7 +143,11 @@ pub struct PackageDataEntry {
/// pipeline so the UI can show real progress instead of a generic
/// "Uninstalling…" spinner. Cleared after the package entry is
/// removed.
#[serde(rename = "uninstall-stage", skip_serializing_if = "Option::is_none", default)]
#[serde(
rename = "uninstall-stage",
skip_serializing_if = "Option::is_none",
default
)]
pub uninstall_stage: Option<String>,
/// Pinned image version from image-versions.sh when it differs from running version
#[serde(rename = "available-update", skip_serializing_if = "Option::is_none")]
@@ -245,6 +249,44 @@ pub enum ServiceStatus {
pub struct InstallProgress {
pub size: u64,
pub downloaded: u64,
/// High-level pipeline phase. Preferred by the UI over the byte
/// counters (podman pull doesn't emit parseable progress on a piped
/// stderr, so `size`/`downloaded` are often 0). Each phase maps to
/// a fixed UI percentage and a descriptive label.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub phase: Option<InstallPhase>,
/// Optional explicit message — used to surface install failures so
/// the UI can keep the app card visible with an error description
/// instead of silently removing the entry on fail. UI's PHASE_INFO
/// label takes precedence when phase is set.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub message: Option<String>,
}
/// Phases of the install / update pipeline, surfaced to the UI so users
/// see where the pipeline is rather than a stuck 0% bar.
///
/// Ordered so each variant's index roughly corresponds to pipeline time.
/// Serialized as kebab-case: "preparing", "pulling-image", …
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub enum InstallPhase {
/// Validating params, checking deps, writing dynamic configs.
Preparing,
/// `podman pull` in progress (the longest phase — up to several
/// minutes for large images on slow networks).
PullingImage,
/// Creating data directories, writing app-specific configs
/// (bitcoin.conf, lnd.conf, searxng settings.yml, chown).
CreatingContainer,
/// `podman run` has returned; container is transitioning to running.
StartingContainer,
/// Post-start loop waiting up to 60s for `State.Status == running`.
WaitingHealthy,
/// Running post-install hooks (chain init, wallet setup, …).
PostInstall,
/// Pipeline finished successfully. Terminal phase, UI clears entry.
Done,
}
/// WebSocket message sent to clients
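
The phase-to-UI mapping described in the `InstallProgress` doc comments can be sketched roughly as below. This is a minimal std-only illustration, not the project's code: the enum mirrors the variants in the hunk above without serde, `wire_name` spells out what `rename_all = "kebab-case"` yields, and the `ui_percent` values are hypothetical placeholders (the real table lives in the frontend's `PHASE_INFO`, which is not shown in this diff).

```rust
/// Mirror of the `InstallPhase` variants from the diff, without serde.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InstallPhase {
    Preparing,
    PullingImage,
    CreatingContainer,
    StartingContainer,
    WaitingHealthy,
    PostInstall,
    Done,
}

impl InstallPhase {
    /// All variants in pipeline order.
    const ALL: [InstallPhase; 7] = [
        InstallPhase::Preparing,
        InstallPhase::PullingImage,
        InstallPhase::CreatingContainer,
        InstallPhase::StartingContainer,
        InstallPhase::WaitingHealthy,
        InstallPhase::PostInstall,
        InstallPhase::Done,
    ];

    /// Kebab-case wire name, as `#[serde(rename_all = "kebab-case")]` would emit.
    fn wire_name(self) -> &'static str {
        match self {
            InstallPhase::Preparing => "preparing",
            InstallPhase::PullingImage => "pulling-image",
            InstallPhase::CreatingContainer => "creating-container",
            InstallPhase::StartingContainer => "starting-container",
            InstallPhase::WaitingHealthy => "waiting-healthy",
            InstallPhase::PostInstall => "post-install",
            InstallPhase::Done => "done",
        }
    }

    /// HYPOTHETICAL fixed percentages; the actual values are an assumption.
    fn ui_percent(self) -> u8 {
        match self {
            InstallPhase::Preparing => 5,
            InstallPhase::PullingImage => 20,
            InstallPhase::CreatingContainer => 70,
            InstallPhase::StartingContainer => 80,
            InstallPhase::WaitingHealthy => 90,
            InstallPhase::PostInstall => 95,
            InstallPhase::Done => 100,
        }
    }
}

fn main() {
    for phase in InstallPhase::ALL {
        println!("{:>3}%  {}", phase.ui_percent(), phase.wire_name());
    }
}
```

Keeping the percentage a pure function of the phase means the bar can advance even when `size` and `downloaded` stay at 0.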


@@ -263,7 +263,12 @@ async fn fetch_electrs_sync_status() -> ElectrsSyncStatus {
let network_height = match bitcoin_network_height().await {
Ok(h) => h,
Err(e) => {
warn!("ElectrumX status: Bitcoin RPC failed: {}", e);
let err_msg = e.to_string();
if is_transient_error(&err_msg) {
tracing::debug!("ElectrumX status: Bitcoin RPC transient: {}", err_msg);
} else {
warn!("ElectrumX status: Bitcoin RPC failed: {}", err_msg);
}
return ElectrsSyncStatus {
indexed_height: 0,
network_height: 0,


@@ -80,8 +80,7 @@ pub async fn record_peer_transport(
let mut modified = false;
for node in nodes.iter_mut() {
let did_match = did.is_some_and(|d| d == node.did);
let onion_match = onion_target
.is_some_and(|t| node.onion.trim_end_matches(".onion") == t);
let onion_match = onion_target.is_some_and(|t| node.onion.trim_end_matches(".onion") == t);
if did_match || onion_match {
node.last_transport = Some(transport.to_string());
node.last_transport_at = Some(now.clone());
@@ -182,9 +181,7 @@ pub async fn update_node_state(data_dir: &Path, did: &str, state: NodeStateSnaps
// routing over FIPS on the very next sync. Refresh if the peer
// rotated their FIPS key, too.
if let Some(ref npub) = state.own_fips_npub {
if !npub.is_empty()
&& node.fips_npub.as_deref().map(str::trim) != Some(npub.trim())
{
if !npub.is_empty() && node.fips_npub.as_deref().map(str::trim) != Some(npub.trim()) {
node.fips_npub = Some(npub.clone());
}
}


@@ -8,9 +8,7 @@ use anyhow::{Context, Result};
use std::path::Path;
use super::storage::update_node_state;
use super::types::{
AppStatus, FederatedNode, FederationPeerHint, NodeStateSnapshot, TrustLevel,
};
use super::types::{AppStatus, FederatedNode, FederationPeerHint, NodeStateSnapshot, TrustLevel};
use crate::fips::dial::PeerRequest;
/// Sync state with a single federated peer. Tries FIPS first; falls back
@@ -68,9 +66,7 @@ pub async fn sync_with_peer(
// hop. Only runs when the source is Trusted — Observer-level peers
// don't get to expand our federation on their own authority.
if peer.trust_level == TrustLevel::Trusted {
if let Err(e) =
merge_transitive_peers(data_dir, &peer.did, &state.federated_peers).await
{
if let Err(e) = merge_transitive_peers(data_dir, &peer.did, &state.federated_peers).await {
tracing::warn!(
peer_did = %peer.did,
error = %e,
@@ -87,10 +83,7 @@ pub async fn sync_with_peer(
/// call sync_with_peer. Used by transitive-discovery code paths where
/// the caller only knows the peer's DID (e.g. the peer-joined RPC's
/// follow-up task).
pub async fn sync_with_peer_by_did(
data_dir: &Path,
peer_did: &str,
) -> Result<NodeStateSnapshot> {
pub async fn sync_with_peer_by_did(data_dir: &Path, peer_did: &str) -> Result<NodeStateSnapshot> {
let nodes = super::storage::load_nodes(data_dir).await?;
let peer = nodes
.into_iter()
@@ -98,8 +91,7 @@ pub async fn sync_with_peer_by_did(
.ok_or_else(|| anyhow::anyhow!("Unknown federation peer: {}", peer_did))?;
let identity_dir = data_dir.join("identity");
let node_identity =
crate::identity::NodeIdentity::load_or_create(&identity_dir).await?;
let node_identity = crate::identity::NodeIdentity::load_or_create(&identity_dir).await?;
let local_pubkey_hex = node_identity.pubkey_hex();
let local_did = crate::identity::did_key_from_pubkey_hex(&local_pubkey_hex)?;
@@ -258,7 +250,11 @@ pub async fn deploy_to_peer(
.context("Failed to reach federated peer for deploy")?;
if !resp.status().is_success() {
anyhow::bail!("Remote node returned HTTP {} (via {})", resp.status(), transport);
anyhow::bail!(
"Remote node returned HTTP {} (via {})",
resp.status(),
transport
);
}
let result: serde_json::Value = resp.json().await.context("Invalid response from peer")?;
@@ -355,10 +351,7 @@ mod tests {
last_transport_at: None,
},
];
let state = build_local_state(
vec![],
0.0, 0, 0, 0, 0, 0, true, None, None, None, &peers,
);
let state = build_local_state(vec![], 0.0, 0, 0, 0, 0, 0, true, None, None, None, &peers);
assert_eq!(state.federated_peers.len(), 1);
assert_eq!(state.federated_peers[0].did, "did:key:zTrusted");
assert_eq!(


@@ -70,8 +70,8 @@ pub async fn load(data_dir: &Path) -> Result<Vec<SeedAnchor>> {
let bytes = tokio::fs::read(&path)
.await
.with_context(|| format!("read {}", path.display()))?;
let anchors: Vec<SeedAnchor> = serde_json::from_slice(&bytes)
.with_context(|| format!("parse {}", path.display()))?;
let anchors: Vec<SeedAnchor> =
serde_json::from_slice(&bytes).with_context(|| format!("parse {}", path.display()))?;
Ok(anchors)
}
@@ -125,12 +125,7 @@ pub async fn apply(anchors: &[SeedAnchor]) -> Vec<ApplyResult> {
let mut results = Vec::with_capacity(anchors.len());
for anchor in anchors {
let out = Command::new("fipsctl")
.args([
"connect",
&anchor.npub,
&anchor.address,
&anchor.transport,
])
.args(["connect", &anchor.npub, &anchor.address, &anchor.transport])
.output()
.await;
let result = match out {


@@ -13,7 +13,9 @@ use anyhow::{Context, Result};
use std::path::Path;
use tokio::process::Command;
use super::{DAEMON_CONFIG_PATH, DAEMON_KEY_PATH, DAEMON_PUB_PATH, DEFAULT_TCP_PORT, DEFAULT_UDP_PORT};
use super::{
DAEMON_CONFIG_PATH, DAEMON_KEY_PATH, DAEMON_PUB_PATH, DEFAULT_TCP_PORT, DEFAULT_UDP_PORT,
};
/// Write the FIPS daemon config based on the local npub and default
/// transports. Overwrites any existing file — callers are expected to


@@ -109,7 +109,7 @@ fn encode_query(id: u16, npub: &str) -> Result<Vec<u8>> {
encode_label(&mut out, npub)?;
encode_label(&mut out, FIPS_DNS_SUFFIX)?;
out.push(0); // root
// QTYPE + QCLASS
// QTYPE + QCLASS
out.extend_from_slice(&QTYPE_AAAA.to_be_bytes());
out.extend_from_slice(&QCLASS_IN.to_be_bytes());
Ok(out)
@@ -247,11 +247,7 @@ pub struct PeerRequest<'a> {
}
impl<'a> PeerRequest<'a> {
pub fn new(
fips_npub: Option<&'a str>,
onion_host: &'a str,
path: &'a str,
) -> Self {
pub fn new(fips_npub: Option<&'a str>, onion_host: &'a str, path: &'a str) -> Self {
Self {
fips_npub,
onion_host,
@@ -312,9 +308,7 @@ impl<'a> PeerRequest<'a> {
}
/// GET with optional header-based auth.
pub async fn send_get(
&self,
) -> Result<(reqwest::Response, crate::transport::TransportKind)> {
pub async fn send_get(&self) -> Result<(reqwest::Response, crate::transport::TransportKind)> {
use crate::settings::transport::TransportPref;
let pref = self.preference().await;
if matches!(pref, TransportPref::Auto | TransportPref::Fips) {
@@ -392,19 +386,14 @@ impl<'a> PeerRequest<'a> {
}
}
async fn send_tor_post_json<B: serde::Serialize>(
&self,
body: &B,
) -> Result<reqwest::Response> {
async fn send_tor_post_json<B: serde::Serialize>(&self, body: &B) -> Result<reqwest::Response> {
let url = self.tor_url();
let client = self.tor_client()?;
let mut rb = client.post(&url).json(body);
for (k, v) in &self.headers {
rb = rb.header(*k, v);
}
rb.send()
.await
.with_context(|| format!("Tor POST {}", url))
rb.send().await.with_context(|| format!("Tor POST {}", url))
}
async fn send_tor_get(&self) -> Result<reqwest::Response> {
@@ -414,9 +403,7 @@ impl<'a> PeerRequest<'a> {
for (k, v) in &self.headers {
rb = rb.header(*k, v);
}
rb.send()
.await
.with_context(|| format!("Tor GET {}", url))
rb.send().await.with_context(|| format!("Tor GET {}", url))
}
fn tor_url(&self) -> String {
@@ -449,7 +436,7 @@ mod tests {
assert_eq!(&q[0..2], &[0x12, 0x34]);
assert_eq!(&q[2..4], &[0x01, 0x00]); // flags RD=1
assert_eq!(&q[4..6], &[0x00, 0x01]); // QDCOUNT=1
// Tail: QTYPE=28, QCLASS=1
// Tail: QTYPE=28, QCLASS=1
assert_eq!(&q[q.len() - 4..], &[0x00, 0x1C, 0x00, 0x01]);
}
@@ -471,7 +458,7 @@ mod tests {
r.extend_from_slice(&1u16.to_be_bytes()); // ANCOUNT
r.extend_from_slice(&0u16.to_be_bytes()); // NSCOUNT
r.extend_from_slice(&0u16.to_be_bytes()); // ARCOUNT
// Question: 1 label "a" + "fips"
// Question: 1 label "a" + "fips"
r.extend_from_slice(b"\x01a\x04fips\x00");
r.extend_from_slice(&QTYPE_AAAA.to_be_bytes());
r.extend_from_slice(&QCLASS_IN.to_be_bytes());
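
The wire layout that the `encode_query` tests assert on (ID, flags with RD=1, QDCOUNT=1, the labels, then QTYPE=28 and QCLASS=1) can be sketched with the standard library alone. This is an illustrative reconstruction under assumptions, not the file's implementation: the header construction and `encode_label` body are not shown in the diff, the value `"fips"` for `FIPS_DNS_SUFFIX` is inferred from the test bytes, and errors are plain `String`s rather than `anyhow`.

```rust
const QTYPE_AAAA: u16 = 28;
const QCLASS_IN: u16 = 1;
/// Assumed value, inferred from the `\x04fips` bytes in the tests.
const FIPS_DNS_SUFFIX: &str = "fips";

/// Append one DNS label: a length byte followed by the label bytes.
/// Labels are limited to 1..=63 bytes by the DNS wire format.
fn encode_label(out: &mut Vec<u8>, label: &str) -> Result<(), String> {
    let bytes = label.as_bytes();
    if bytes.is_empty() || bytes.len() > 63 {
        return Err(format!("invalid DNS label length: {}", bytes.len()));
    }
    out.push(bytes.len() as u8);
    out.extend_from_slice(bytes);
    Ok(())
}

/// Build a single-question AAAA query for `<name>.fips`.
fn encode_query(id: u16, name: &str) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    out.extend_from_slice(&id.to_be_bytes()); // transaction ID
    out.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: RD=1
    out.extend_from_slice(&1u16.to_be_bytes()); // QDCOUNT
    out.extend_from_slice(&0u16.to_be_bytes()); // ANCOUNT
    out.extend_from_slice(&0u16.to_be_bytes()); // NSCOUNT
    out.extend_from_slice(&0u16.to_be_bytes()); // ARCOUNT
    encode_label(&mut out, name)?;
    encode_label(&mut out, FIPS_DNS_SUFFIX)?;
    out.push(0); // root label terminates the name
    out.extend_from_slice(&QTYPE_AAAA.to_be_bytes());
    out.extend_from_slice(&QCLASS_IN.to_be_bytes());
    Ok(out)
}

fn main() {
    let q = encode_query(0x1234, "a").expect("short label encodes");
    println!("{} bytes: {:02x?}", q.len(), q);
}
```

All multi-byte fields are big-endian, which is why the tests compare against `[0x00, 0x1C, 0x00, 0x01]` for the tail.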


@@ -24,9 +24,7 @@ pub const FIPS_IFACE: &str = "fips0";
/// - Link-local (`fe80::/10`) and non-ULA addresses are ignored — we
/// only want the mesh-routable ULA that `<npub>.fips` DNS resolves to.
pub fn fips0_ula() -> Option<Ipv6Addr> {
addresses_on(FIPS_IFACE)
.into_iter()
.find(|a| is_ula(a))
addresses_on(FIPS_IFACE).into_iter().find(|a| is_ula(a))
}
/// List every IPv6 address bound to a given interface from


@@ -122,8 +122,7 @@ impl FipsStatus {
};
let service_state = service::unit_state(SERVICE_UNIT).await;
let upstream_service_state = service::unit_state(UPSTREAM_SERVICE_UNIT).await;
let service_active =
service_state == "active" || upstream_service_state == "active";
let service_active = service_state == "active" || upstream_service_state == "active";
let key_present = crate::identity::fips_key_exists(&identity_dir);
// Prefer the seed-derived npub; otherwise read the daemon's own
@@ -180,7 +179,10 @@ mod tests {
// anchor is the only candidate.
let status = FipsStatus::query(dir.path()).await;
assert!(!status.key_present, "no key before onboarding");
assert!(status.npub.is_none());
// `npub` falls back to whatever an already-running local fips
// daemon advertises, so on a dev machine or node with fips
// installed this field can be Some(...) even when the test
// data_dir is empty. We only assert that key_present is false.
// `installed`, `service_state`, `version` depend on the host and are
// not asserted here — query() must return cleanly regardless.
}


@@ -150,11 +150,10 @@ pub async fn peer_connectivity_summary(anchor_candidates: &[String]) -> (u32, bo
Ok(o) if o.status.success() => o.stdout,
_ => return (0, false),
};
let parsed: serde_json::Value =
match serde_json::from_slice(&peers_json) {
Ok(v) => v,
Err(_) => return (0, false),
};
let parsed: serde_json::Value = match serde_json::from_slice(&peers_json) {
Ok(v) => v,
Err(_) => return (0, false),
};
let peers = parsed
.get("peers")
.and_then(|p| p.as_array())


@@ -539,6 +539,20 @@ pub fn spawn_health_monitor(state: Arc<StateManager>, data_dir: PathBuf) {
debug!("Skipping uninstalled container: {}", container.name);
continue;
}
} else {
// Orphan: container exists in podman but archipelago has
// no package_data entry for it. Common after a variant
// switch (bitcoin-core ↔ bitcoin-knots) where the
// uninstall removed the package entry but the prior
// variant's container survived in stopped state. Without
// this guard the health monitor pages every minute with
// "Auto-restart failed (attempt N/10)" for an app the
// user can no longer see in the dashboard.
debug!(
"Skipping orphan container (not in package_data): {}",
container.name
);
continue;
}
if container.healthy {


@@ -111,11 +111,7 @@ pub struct ProfilePublishOutcome {
/// (trailing slash, case). nostr-sdk canonicalises URLs internally and
/// we compare on the surface strings, so be liberal about what matches.
fn relay_url_matches(a: &str, b: &str) -> bool {
let norm = |s: &str| {
s.trim_end_matches('/')
.trim()
.to_ascii_lowercase()
};
let norm = |s: &str| s.trim_end_matches('/').trim().to_ascii_lowercase();
norm(a) == norm(b)
}
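
A small usage sketch for `relay_url_matches`, copied from the reformatted one-liner above so it runs standalone. The relay URLs are made-up examples. One detail worth noticing is the order of operations: the trailing slash is trimmed before whitespace, so a URL ending in `"/ "` keeps its slash.

```rust
/// Same body as the function in the hunk above.
fn relay_url_matches(a: &str, b: &str) -> bool {
    let norm = |s: &str| s.trim_end_matches('/').trim().to_ascii_lowercase();
    norm(a) == norm(b)
}

fn main() {
    // Trailing slash and case differences are ignored.
    assert!(relay_url_matches(
        "wss://Relay.Example.com/",
        "wss://relay.example.com"
    ));
    // Surrounding whitespace is ignored as well.
    assert!(relay_url_matches(
        " wss://relay.example.com",
        "wss://relay.example.com"
    ));
    // A slash followed by whitespace is NOT stripped, because
    // `trim_end_matches('/')` runs before `trim()`.
    assert!(!relay_url_matches(
        "wss://relay.example.com/ ",
        "wss://relay.example.com"
    ));
    println!("relay_url_matches behaves as described");
}
```

If inputs with trailing whitespace after the slash are expected, swapping the two trims would cover that case; whether that matters depends on where the URLs come from.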
@@ -262,8 +258,8 @@ impl IdentityManager {
derivation_index: Some(0),
};
let json = serde_json::to_string_pretty(&identity_file)
.context("Failed to serialize identity")?;
let json =
serde_json::to_string_pretty(&identity_file).context("Failed to serialize identity")?;
fs::write(&file_path, json.as_bytes())
.await
.context("Failed to write identity file")?;
@@ -701,11 +697,8 @@ impl IdentityManager {
let event_id = output.id().to_hex();
// `Output` has `success: HashSet<RelayUrl>` + `failed: HashMap<RelayUrl, String>`.
// Normalise to string comparisons (RelayUrl trims trailing slashes etc.).
let success_strs: std::collections::HashSet<String> = output
.success
.iter()
.map(|u| u.to_string())
.collect();
let success_strs: std::collections::HashSet<String> =
output.success.iter().map(|u| u.to_string()).collect();
let failed_strs: std::collections::HashMap<String, String> = output
.failed
.iter()
@@ -714,14 +707,11 @@ impl IdentityManager {
let mut accepted: Vec<String> = Vec::new();
let mut rejected: Vec<(String, String)> = Vec::new();
for url in relay_urls {
let match_url = success_strs
.iter()
.any(|s| relay_url_matches(s, url));
let match_url = success_strs.iter().any(|s| relay_url_matches(s, url));
if match_url {
accepted.push(url.clone());
} else if let Some((_, reason)) = failed_strs
.iter()
.find(|(s, _)| relay_url_matches(s, url))
} else if let Some((_, reason)) =
failed_strs.iter().find(|(s, _)| relay_url_matches(s, url))
{
rejected.push((url.clone(), reason.clone()));
} else {
@@ -885,11 +875,13 @@ mod tests {
async fn test_create_nostr_key_npub_format() {
let dir = tempdir().unwrap();
let mgr = IdentityManager::new(dir.path()).await.unwrap();
// `create()` auto-provisions a Nostr key for every identity, so the
// returned record should already have a valid bech32 npub.
let record = mgr
.create("Nostr".to_string(), IdentityPurpose::Personal)
.create("Personal".to_string(), IdentityPurpose::Personal)
.await
.unwrap();
let npub = mgr.create_nostr_key(&record.id).await.unwrap();
let npub = record.nostr_npub.expect("nostr npub should be populated");
assert!(
npub.starts_with("npub1"),
"npub should start with npub1, got {}",


@@ -19,7 +19,9 @@
use anyhow::{Context, Result};
use std::net::SocketAddr;
use std::sync::Arc;
use tokio::signal;
use tokio::sync::Notify;
use tracing::info;
mod api;
@@ -69,6 +71,10 @@ mod wallet;
mod webhooks;
use config::Config;
use container::{
BootReconciler, ContainerOrchestrator, DevContainerOrchestrator, ProdContainerOrchestrator,
RECONCILER_DEFAULT_INTERVAL,
};
use server::Server;
#[tokio::main]
@@ -98,10 +104,7 @@ async fn main() -> Result<()> {
if let Ok(meta) = tokio::fs::metadata(web_ui).await {
let mode = meta.permissions().mode() & 0o777;
if mode & 0o005 != 0o005 {
tracing::warn!(
"web-ui perms {:o} not world-readable — self-healing",
mode
);
tracing::warn!("web-ui perms {:o} not world-readable — self-healing", mode);
let _ = tokio::process::Command::new("sudo")
.args([
"-n",
@@ -168,15 +171,65 @@ async fn main() -> Result<()> {
// Signal to health monitor that boot recovery is done
crash_recovery::mark_recovery_complete();
// Boot reconciliation disabled — the reconciler creates ALL containers
// from specs, which is wrong on unbundled installs where only user-chosen
// apps should exist. The health monitor handles restarting existing
// containers. Run reconcile-containers.sh manually when needed.
// crash_recovery::run_boot_reconciliation().await;
});
}
// Construct the container orchestrator once. In prod mode we load the
// on-disk app manifests, do an initial adoption pass, and spawn the
// BootReconciler loop (Step 5/6 of the rust-orchestrator migration).
// Dev mode uses the in-memory DevContainerOrchestrator and has no
// reconciler (manifests are pushed via RPC, not discovered from disk).
let shutdown_notify = Arc::new(Notify::new());
let (orchestrator, dev_orchestrator): (
Option<Arc<dyn ContainerOrchestrator>>,
Option<Arc<DevContainerOrchestrator>>,
) = if config.dev_mode {
let dev = Arc::new(DevContainerOrchestrator::new(config.clone()).await?);
let trait_obj: Arc<dyn ContainerOrchestrator> = dev.clone();
(Some(trait_obj), Some(dev))
} else {
let prod = Arc::new(ProdContainerOrchestrator::new(config.clone()).await?);
// Best-effort manifest load; a missing /opt/archipelago/apps is
// logged inside load_manifests and not fatal.
match prod.load_manifests().await {
Ok(n) => info!("📦 Loaded {n} app manifest(s) from disk"),
Err(e) => {
tracing::error!(error = %e, "prod orchestrator: load_manifests failed at startup");
}
}
// Adoption pass: link existing podman containers back to their
// manifests so the reconciler doesn't recreate them.
match prod.adopt_existing().await {
Ok(report) => {
info!(
"🔗 Adopted {} existing container(s): {:?}",
report.adopted.len(),
report.adopted
);
}
Err(e) => {
tracing::warn!(error = %e, "prod orchestrator: adopt_existing failed (non-fatal)");
}
}
// Spawn the boot reconciler loop. Runs an initial reconcile
// immediately, then re-checks every RECONCILER_DEFAULT_INTERVAL
// until shutdown_notify fires.
{
let reconciler = BootReconciler::new(
prod.clone(),
RECONCILER_DEFAULT_INTERVAL,
shutdown_notify.clone(),
);
tokio::spawn(reconciler.run_forever());
info!(
"🔄 Boot reconciler started (interval: {:?})",
RECONCILER_DEFAULT_INTERVAL
);
}
let trait_obj: Arc<dyn ContainerOrchestrator> = prod;
(Some(trait_obj), None)
};
// Ensure a default user exists so login works after install/onboarding.
// In production, the default password is "password123" (shown during install).
// In dev mode, the dev default password is used.
@@ -185,7 +238,7 @@ async fn main() -> Result<()> {
// "Create Password" form instead of login form.
// Create server
let server = Server::new(config.clone()).await?;
let server = Server::new(config.clone(), orchestrator, dev_orchestrator).await?;
// Start server
let addr: SocketAddr = format!("{}:{}", config.bind_host, config.bind_port)
@@ -261,6 +314,7 @@ async fn main() -> Result<()> {
// Graceful shutdown: wait for SIGTERM or SIGINT
let mut sigterm = signal::unix::signal(signal::unix::SignalKind::terminate())
.context("Failed to register SIGTERM handler")?;
let shutdown_notify_for_signal = shutdown_notify.clone();
let shutdown = async move {
tokio::select! {
_ = signal::ctrl_c() => {
@@ -270,6 +324,10 @@ async fn main() -> Result<()> {
info!("Received SIGTERM, initiating graceful shutdown...");
}
}
// Signal the boot reconciler (and any other subscribers) to stop.
// `notify_one` stores a permit if no task is currently parked on
// `notified()`, so we don't race the reconciler's reconcile_all pass.
shutdown_notify_for_signal.notify_one();
};
server.serve_with_shutdown(addr, shutdown).await?;
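
The shutdown comment relies on `notify_one` storing a permit when nobody is waiting yet. The sketch below shows that idea with a std-only analog built from `Mutex<bool>` and `Condvar`; it is not `tokio::sync::Notify`, and `PermitNotify` is a hypothetical name used only for illustration.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Std-only stand-in for the "stored permit" behaviour described in the
/// shutdown comment. Hypothetical type, not part of the codebase.
struct PermitNotify {
    permit: Mutex<bool>,
    cv: Condvar,
}

impl PermitNotify {
    fn new() -> Self {
        PermitNotify {
            permit: Mutex::new(false),
            cv: Condvar::new(),
        }
    }

    /// Store a permit and wake one waiter, if any is parked.
    fn notify_one(&self) {
        *self.permit.lock().unwrap() = true;
        self.cv.notify_one();
    }

    /// Block until a permit is available, then consume it.
    fn wait(&self) {
        let mut permit = self.permit.lock().unwrap();
        while !*permit {
            permit = self.cv.wait(permit).unwrap();
        }
        *permit = false;
    }
}

fn main() {
    let notify = Arc::new(PermitNotify::new());

    // The signal fires while the worker is still busy, before it waits.
    notify.notify_one();

    let worker = {
        let notify = Arc::clone(&notify);
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(10)); // simulated reconcile pass
            notify.wait(); // returns immediately: the permit was stored
            "stopped"
        })
    };

    assert_eq!(worker.join().unwrap(), "stopped");
    println!("worker observed the earlier shutdown signal");
}
```

Without the stored permit, a signal sent while the worker is mid-pass would be lost and the worker would block forever on its next wait.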


@@ -457,8 +457,9 @@ mod tests {
let key = SigningKey::generate(&mut OsRng);
let wire = build_block_header_announcement(
890412,
"0000000000000000000abc",
"0000000000000000000aab",
// Block hashes must be 32 bytes (64 hex chars). Use realistic-shaped placeholders.
"0000000000000000000abc00000000000000000000000000000000000000abcd",
"0000000000000000000aab0000000000000000000000000000000000000aabcd",
1710633600,
"did:key:z6MkTest",
&key,
@@ -469,7 +470,9 @@ mod tests {
assert_eq!(wire[0], 0x02);
let envelope = TypedEnvelope::from_wire(&wire).unwrap();
assert_eq!(envelope.t, MeshMessageType::BlockHeader as u8);
assert!(envelope.sig.is_some());
// Block header announcements are intentionally unsigned to save 64 bytes
// on the 160-byte LoRa payload (see builder comment).
assert!(envelope.sig.is_none());
}
#[test]


@@ -52,7 +52,10 @@ impl PendingMessage {
return true; // Can't parse = treat as expired
};
let age = chrono::Utc::now().signed_duration_since(created);
age.num_seconds() as u64 > self.ttl_secs
// Use `>=` so a ttl_secs=0 message is expired immediately (used by
// tests and by callers that want a fire-and-forget behavior when
// the relay can't deliver on first try).
age.num_seconds() as u64 >= self.ttl_secs
}
/// Check if this message can be relayed further.
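
The effect of switching from `>` to `>=` can be checked on plain integers. In this sketch `age_secs` stands in for `age.num_seconds()` from the hunk above, and the function names are hypothetical; only the comparison itself is taken from the diff.

```rust
/// Patched comparison: expired once the age reaches the TTL.
fn is_expired(age_secs: i64, ttl_secs: u64) -> bool {
    age_secs as u64 >= ttl_secs
}

/// Pre-patch comparison, kept here only for contrast.
fn is_expired_old(age_secs: i64, ttl_secs: u64) -> bool {
    age_secs as u64 > ttl_secs
}

fn main() {
    // ttl_secs = 0: expired immediately with `>=`, but not with `>`.
    assert!(is_expired(0, 0));
    assert!(!is_expired_old(0, 0));

    // The boundary moves by one second for every TTL.
    assert!(!is_expired(59, 60));
    assert!(is_expired(60, 60));
    assert!(!is_expired_old(60, 60));

    // A negative age (timestamp in the future, e.g. clock skew) wraps to a
    // very large u64 under `as`, so the message counts as expired.
    assert!(is_expired(-1, 3600));

    println!("expiry boundary checks passed");
}
```

The last case follows from how `as` converts a negative `i64` to `u64`; if future-dated messages should instead be kept, clamping the age at zero before the cast would be one option.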


@@ -701,9 +701,11 @@ mod tests {
#[test]
fn test_build_app_start() -> Result<()> {
// Frame layout: [0: '>'][1-2: len LE][3: CMD][4: VERSION][5..: padded name]
let frame = build_app_start("Archipelago");
assert_eq!(frame[3], CMD_APP_START);
let name = &frame[4..];
assert_eq!(frame[4], PROTOCOL_VERSION);
let name = &frame[5..];
assert_eq!(
std::str::from_utf8(name)
.map_err(|e| anyhow::anyhow!("invalid UTF-8 in app name: {}", e))?,
@@ -753,15 +755,20 @@ mod tests {
#[test]
fn test_identity_broadcast_roundtrip() -> Result<()> {
let did = "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK";
// The v2 encoding drops the DID and the decoder reconstructs it
// deterministically from the ed25519 pubkey, so the roundtripped
// DID won't equal an arbitrary input DID. Derive what the decoder
// will produce and assert against that.
let ed_pub = "a".repeat(64);
let x25519_pub = "b".repeat(64);
let expected_did = crate::identity::did_key_from_pubkey_hex(&ed_pub)
.map_err(|e| anyhow::anyhow!("derive did: {}", e))?;
let encoded = encode_identity_broadcast(did, &ed_pub, &x25519_pub);
let encoded = encode_identity_broadcast(&expected_did, &ed_pub, &x25519_pub);
let (parsed_did, parsed_ed, parsed_x) = parse_identity_broadcast(&encoded)
.ok_or_else(|| anyhow::anyhow!("failed to parse identity broadcast"))?;
assert_eq!(parsed_did, did);
assert_eq!(parsed_did, expected_did);
assert_eq!(parsed_ed, ed_pub);
assert_eq!(parsed_x, x25519_pub);
Ok(())


@@ -267,7 +267,7 @@ async fn sync_single_peer(
// Best-effort push — don't fail the whole sync if a batch fails.
match PeerRequest::new(fips_npub, onion, "/dwn")
.service(crate::settings::transport::PeerService::Federation)
.service(crate::settings::transport::PeerService::Federation)
.timeout(std::time::Duration::from_secs(30))
.send_json(&push_body)
.await


@@ -275,25 +275,24 @@ pub async fn send_to_peer(
body["from_name"] = serde_json::Value::String(name.to_string());
}
let (resp, transport) = crate::fips::dial::PeerRequest::new(
fips_npub,
onion,
"/archipelago/node-message",
)
.service(crate::settings::transport::PeerService::Messaging)
.timeout(std::time::Duration::from_secs(60))
.send_json(&body)
.await
.map_err(|e| {
let msg = e.to_string();
if msg.contains("connection refused") || msg.contains("Connection refused") {
anyhow::anyhow!("Peer unreachable. Check Tor (127.0.0.1:9050) and FIPS daemon status.")
} else if msg.contains("timeout") || msg.contains("timed out") {
anyhow::anyhow!("Connection timed out. The peer may be offline.")
} else {
anyhow::anyhow!("Failed to send: {}", msg)
}
})?;
let (resp, transport) =
crate::fips::dial::PeerRequest::new(fips_npub, onion, "/archipelago/node-message")
.service(crate::settings::transport::PeerService::Messaging)
.timeout(std::time::Duration::from_secs(60))
.send_json(&body)
.await
.map_err(|e| {
let msg = e.to_string();
if msg.contains("connection refused") || msg.contains("Connection refused") {
anyhow::anyhow!(
"Peer unreachable. Check Tor (127.0.0.1:9050) and FIPS daemon status."
)
} else if msg.contains("timeout") || msg.contains("timed out") {
anyhow::anyhow!("Connection timed out. The peer may be offline.")
} else {
anyhow::anyhow!("Failed to send: {}", msg)
}
})?;
if !resp.status().is_success() {
anyhow::bail!(


@@ -24,7 +24,7 @@ const RESERVED_PORTS: &[u16] = &[
9980, 9001, // OnlyOffice, Penpot
8240, // Tailscale
9000, // Portainer
3001, // Uptime Kuma
3001, 3002, // Gitea, Uptime Kuma
8888, // SearXNG
8096, 2342, 2283, // Jellyfin, Photoprism, Immich
8443, 8084, // NPM


@@ -1,6 +1,8 @@
use crate::api::ApiHandler;
use crate::config::{Config, ContainerRuntime};
use crate::container::{docker_packages, DockerPackageScanner};
use crate::container::{
docker_packages, ContainerOrchestrator, DevContainerOrchestrator, DockerPackageScanner,
};
use crate::identity::{self, NodeIdentity};
use crate::monitoring::MetricsStore;
use crate::node_message;
@@ -14,7 +16,7 @@ use hyper::service::service_fn;
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;
use std::time::Duration;
use std::time::{Duration, Instant};
use tokio::net::TcpListener;
use tracing::{debug, error, info, warn};
@@ -26,7 +28,11 @@ pub struct Server {
}
impl Server {
pub async fn new(config: Config) -> Result<Self> {
pub async fn new(
config: Config,
orchestrator: Option<Arc<dyn ContainerOrchestrator>>,
dev_orchestrator: Option<Arc<DevContainerOrchestrator>>,
) -> Result<Self> {
let state_manager = Arc::new(StateManager::new());
// Load node identity and set stable server_info.
@@ -172,8 +178,16 @@ impl Server {
Some(config.data_dir.clone()),
);
let api_handler =
Arc::new(ApiHandler::new(config.clone(), state_manager.clone(), metrics_store).await?);
let api_handler = Arc::new(
ApiHandler::new(
config.clone(),
state_manager.clone(),
metrics_store,
orchestrator,
dev_orchestrator,
)
.await?,
);
// Initialize mesh networking service (if config has enabled: true)
{
@@ -299,6 +313,8 @@ impl Server {
let scanner = create_docker_scanner(&config).await?;
let state = state_manager.clone();
let identity_clone = identity.clone();
let scan_kick = api_handler.rpc_handler().scan_kick();
let scan_tick = api_handler.rpc_handler().scan_tick();
// Initial scan (delayed to let crash recovery finish first)
tokio::spawn(async move {
@@ -308,18 +324,31 @@ impl Server {
// Tracks how many consecutive scans each container has been absent from.
// Prevents UI flapping when podman intermittently returns incomplete results.
let mut absence_tracker: HashMap<String, u32> = HashMap::new();
// Tracks when each container first entered a transitional state
// (Stopping / Starting / Restarting / ...). Used by the merge
// loop below to ignore podman's live state during a pending
// lifecycle op, and to break out if the spawned task dies
// without ever writing a final state.
let mut transitional_since: HashMap<String, Instant> = HashMap::new();
if let Err(e) = scan_and_update_packages(
&scanner,
&state,
identity_clone.as_ref(),
&mut absence_tracker,
&mut transitional_since,
)
.await
{
error!("Failed to scan containers: {}", e);
}
// Bump the scan-completion counter so any caller waiting on a
// kicked scan (install/update success path) can proceed.
scan_tick.send_modify(|n| *n = n.wrapping_add(1));
// Periodic scan every 60 seconds (only broadcasts if state changed)
// Periodic scan every 60 seconds (only broadcasts if state changed).
// Also wakes immediately when `scan_kick` fires — install/update
// success paths poke it so the fresh manifest (with populated
// interfaces) lands before they flip state to Running.
// Uses an in-flight guard to skip scans when a previous one is still running
let mut interval = tokio::time::interval(Duration::from_secs(60));
// Skip missed ticks instead of catching up — prevents burst of scans
@@ -327,7 +356,12 @@ impl Server {
interval.set_missed_tick_behavior(tokio::time::MissedTickBehavior::Skip);
let scanning = std::sync::Arc::new(std::sync::atomic::AtomicBool::new(false));
loop {
interval.tick().await;
tokio::select! {
_ = interval.tick() => {}
_ = scan_kick.notified() => {
debug!("Scan kicked by install/update success — running immediately");
}
}
if scanning.load(std::sync::atomic::Ordering::Relaxed) {
debug!("Skipping container scan — previous scan still in progress");
continue;
@@ -338,11 +372,13 @@ impl Server {
&state,
identity_clone.as_ref(),
&mut absence_tracker,
&mut transitional_since,
)
.await
{
error!("Failed to update containers: {}", e);
}
scan_tick.send_modify(|n| *n = n.wrapping_add(1));
scanning.store(false, std::sync::atomic::Ordering::Relaxed);
}
});
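
The `absence_tracker` comment describes counting consecutive scans in which a container is missing before it is dropped. A minimal sketch of that bookkeeping follows; `update_absences` and its signature are hypothetical, since the real logic inside `scan_and_update_packages` is not shown in this diff. Only the threshold of 3 is taken from the source.

```rust
use std::collections::{HashMap, HashSet};

const CONTAINER_ABSENCE_THRESHOLD: u32 = 3;

/// Hypothetical helper: given the ids we currently track and the ids seen in
/// this scan, bump the miss counter for absent ids and return those that have
/// now been absent for `CONTAINER_ABSENCE_THRESHOLD` consecutive scans.
fn update_absences(
    known: &[&str],
    seen: &HashSet<&str>,
    tracker: &mut HashMap<String, u32>,
) -> Vec<String> {
    let mut to_remove = Vec::new();
    for &id in known {
        if seen.contains(id) {
            // Present again: forget any earlier misses.
            tracker.remove(id);
            continue;
        }
        let misses = tracker.entry(id.to_string()).or_insert(0);
        *misses += 1;
        if *misses >= CONTAINER_ABSENCE_THRESHOLD {
            to_remove.push(id.to_string());
        }
    }
    // Removed ids no longer need a counter.
    for id in &to_remove {
        tracker.remove(id);
    }
    to_remove
}

fn main() {
    let known = ["bitcoin", "lnd"];
    let seen: HashSet<&str> = ["lnd"].into_iter().collect();
    let mut tracker = HashMap::new();

    for scan in 1..=3 {
        let removed = update_absences(&known, &seen, &mut tracker);
        println!("scan {scan}: removed {removed:?}");
    }
}
```

A single incomplete result from podman only increments the counter, so the UI entry survives until the container has been missing for several scans in a row.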
@@ -382,10 +418,9 @@ impl Server {
let _ = crate::fips::anchors::apply(&list).await;
}
Ok(_) => { /* no seed anchors configured yet */ }
Err(e) => tracing::debug!(
"Seed-anchor apply: load failed (non-fatal): {}",
e
),
Err(e) => {
tracing::debug!("Seed-anchor apply: load failed (non-fatal): {}", e)
}
}
}
});
@@ -527,9 +562,7 @@ impl Server {
// OTA'd nodes would be stuck on the old UDP-only config
// until someone manually clicked Reconnect.
let expected = crate::fips::config::render_config_yaml();
let installed = tokio::fs::read_to_string("/etc/fips/fips.yaml")
.await
.ok();
let installed = tokio::fs::read_to_string("/etc/fips/fips.yaml").await.ok();
let config_changed = installed.as_deref() != Some(expected.as_str());
if let Err(e) = crate::fips::config::install(&identity_dir).await {
@@ -795,11 +828,65 @@ async fn refresh_tor_address(state: &StateManager, identity: &NodeIdentity) -> R
/// 3 scans × 60s = 180 seconds of absence before removal.
const CONTAINER_ABSENCE_THRESHOLD: u32 = 3;
/// Maximum time a package entry may remain stuck in a transitional state
/// before the scan loop overrides it with podman's live state.
///
/// Rationale: the longest single-container stop timeout is bitcoin-core at
/// 600s. 2× that gives the spawned task ample margin before we assume it
/// died (panic, OOM, process restart mid-stop) and fall back to the
/// scanner's authoritative view. Applies to all transitional variants.
const TRANSITIONAL_STUCK_TIMEOUT: Duration = Duration::from_secs(1200);
/// Returns true if `state` is one of the transitional variants that a
/// `spawn_transitional`-style background task owns. While such a state is
/// set, the package scanner must not overwrite it with whatever podman
/// reports (see `merge_preserving_transitional`).
fn is_transitional(state: &crate::data_model::PackageState) -> bool {
use crate::data_model::PackageState::*;
matches!(
state,
Installing
| Stopping
| Starting
| Restarting
| Updating
| Removing
| CreatingBackup
| RestoringBackup
| BackingUp
)
}
/// Merge a fresh scan entry `fresh` into `existing` while preserving
/// `existing.state` (which is transitional — the RPC spawn task owns it).
/// Non-state observability fields are taken from `fresh` so the UI still
/// sees live health / exit_code / lan_address readings during a transition.
fn merge_preserving_transitional(
existing: &crate::data_model::PackageDataEntry,
fresh: &crate::data_model::PackageDataEntry,
) -> crate::data_model::PackageDataEntry {
crate::data_model::PackageDataEntry {
state: existing.state.clone(),
// install_progress and uninstall_stage are also owned by the
// initiating op (same reason as state) — keep them.
install_progress: existing.install_progress.clone(),
uninstall_stage: existing.uninstall_stage.clone(),
// Everything else comes from the fresh scan.
health: fresh.health.clone(),
exit_code: fresh.exit_code,
static_files: fresh.static_files.clone(),
manifest: fresh.manifest.clone(),
installed: fresh.installed.clone(),
available_update: fresh.available_update.clone(),
}
}
async fn scan_and_update_packages(
scanner: &DockerPackageScanner,
state: &StateManager,
identity: &NodeIdentity,
absence_tracker: &mut HashMap<String, u32>,
transitional_since: &mut HashMap<String, Instant>,
) -> Result<()> {
let packages = scanner.scan_containers().await?;
@@ -833,10 +920,61 @@ async fn scan_and_update_packages(
let mut merged = current_data.package_data.clone();
let mut changed = false;
// Update/add containers found in this scan
// Update/add containers found in this scan.
//
// Transitional states (Stopping, Starting, Restarting, Installing,
// Updating, Removing, backup variants) are owned by the RPC spawn_task
// that initiated the operation — podman's live state during the op is
// meaningless ("running" during a graceful stop, "exited" during a
// restart, etc.) and must not be written back. See
// `merge_preserving_transitional` for the exact rule.
//
// Escape hatch: if a package has been in a transitional state for
// longer than TRANSITIONAL_STUCK_TIMEOUT we assume the spawned task
// died without cleanup and let the scan override it.
let now = Instant::now();
for (id, pkg) in &packages {
absence_tracker.remove(id);
if merged.get(id) != Some(pkg) {
let existing = merged.get(id);
let overwrite = match existing {
Some(existing_entry) if is_transitional(&existing_entry.state) => {
let entered = *transitional_since.entry(id.clone()).or_insert(now);
let stuck = now.duration_since(entered) > TRANSITIONAL_STUCK_TIMEOUT;
if stuck {
warn!(
"Container {} stuck in {:?} for >{}s; overriding with scan state {:?}",
id,
existing_entry.state,
TRANSITIONAL_STUCK_TIMEOUT.as_secs(),
pkg.state
);
transitional_since.remove(id);
true
} else {
// Keep existing transitional state, but merge non-state
// observability fields (health, exit_code, lan_address
// via installed) from the fresh scan so the UI still
// sees live readings.
let merged_entry = merge_preserving_transitional(existing_entry, pkg);
if existing.cloned() != Some(merged_entry.clone()) {
merged.insert(id.clone(), merged_entry);
changed = true;
}
false
}
}
Some(_) => {
// Not transitional: the side-table may hold a stale entry
// from a previous transition on this id; drop it.
transitional_since.remove(id);
existing != Some(pkg)
}
None => {
transitional_since.remove(id);
true
}
};
if overwrite && merged.get(id) != Some(pkg) {
merged.insert(id.clone(), pkg.clone());
changed = true;
}
@@ -847,6 +985,17 @@ async fn scan_and_update_packages(
let current_ids: Vec<String> = merged.keys().cloned().collect();
for id in current_ids {
if !packages.contains_key(&id) {
// Don't evict packages mid-transition: Installing/Updating/Removing
// legitimately have no live container yet (image still pulling) or
// briefly (during recreate). The absence-eviction here was racing
// installs and removing apps from the UI 14s in. The transitional
// owner (spawn_task) is responsible for clearing state, not us.
if let Some(entry) = merged.get(&id) {
if is_transitional(&entry.state) {
absence_tracker.remove(&id);
continue;
}
}
let count = absence_tracker.entry(id.clone()).or_insert(0);
*count += 1;
if *count >= CONTAINER_ABSENCE_THRESHOLD {
@@ -856,6 +1005,7 @@ async fn scan_and_update_packages(
);
merged.remove(&id);
absence_tracker.remove(&id);
transitional_since.remove(&id);
changed = true;
}
}
@@ -926,10 +1076,9 @@ async fn check_peer_health(state: &StateManager, data_dir: &std::path::Path) ->
let mut new_health = std::collections::HashMap::new();
for peer in &known_peers {
let fips_npub = crate::federation::fips_npub_for_onion(data_dir, &peer.onion).await;
let reachable =
node_message::check_peer_reachable(&peer.onion, fips_npub.as_deref())
.await
.unwrap_or(false);
let reachable = node_message::check_peer_reachable(&peer.onion, fips_npub.as_deref())
.await
.unwrap_or(false);
new_health.insert(peer.onion.clone(), reachable);
}
@@ -943,3 +1092,106 @@ async fn check_peer_health(state: &StateManager, data_dir: &std::path::Path) ->
Ok(())
}
#[cfg(test)]
mod merge_tests {
use super::*;
use crate::data_model::{Description, Manifest, PackageDataEntry, PackageState, StaticFiles};
fn make_manifest() -> Manifest {
Manifest {
id: "lnd".to_string(),
title: "LND".to_string(),
version: "0.18.4".to_string(),
description: Description {
short: "".to_string(),
long: "".to_string(),
},
release_notes: "".to_string(),
license: "".to_string(),
wrapper_repo: "".to_string(),
upstream_repo: "".to_string(),
support_site: "".to_string(),
marketing_site: "".to_string(),
donation_url: None,
author: None,
website: None,
interfaces: None,
tier: None,
}
}
fn make_static() -> StaticFiles {
StaticFiles {
license: "".to_string(),
instructions: "".to_string(),
icon: "".to_string(),
}
}
fn make_entry(state: PackageState, health: Option<&str>) -> PackageDataEntry {
PackageDataEntry {
state,
health: health.map(|s| s.to_string()),
exit_code: None,
static_files: make_static(),
manifest: make_manifest(),
installed: None,
install_progress: None,
uninstall_stage: None,
available_update: None,
}
}
#[test]
fn preserves_transitional_state_on_merge() {
// existing: user initiated a stop, spawn_transitional set Stopping.
// fresh: podman hasn't finished the stop yet, still reports Running.
// Expected: merged state stays Stopping — podman's live view must
// not clobber the transitional state owned by the RPC spawn task.
let existing = make_entry(PackageState::Stopping, Some("healthy"));
let fresh = make_entry(PackageState::Running, Some("starting"));
let merged = merge_preserving_transitional(&existing, &fresh);
assert_eq!(merged.state, PackageState::Stopping);
}
#[test]
fn merges_fresh_observability_fields() {
// Non-state observability fields (health, exit_code, installed)
// MUST come from the fresh scan even while state is preserved —
// the UI still shows live health/exit_code during a transition.
let mut existing = make_entry(PackageState::Stopping, Some("healthy"));
existing.exit_code = None;
let mut fresh = make_entry(PackageState::Running, Some("unhealthy"));
fresh.exit_code = Some(0);
let merged = merge_preserving_transitional(&existing, &fresh);
assert_eq!(merged.state, PackageState::Stopping);
assert_eq!(merged.health.as_deref(), Some("unhealthy"));
assert_eq!(merged.exit_code, Some(0));
}
#[test]
fn is_transitional_covers_all_variants() {
for s in [
PackageState::Installing,
PackageState::Stopping,
PackageState::Starting,
PackageState::Restarting,
PackageState::Updating,
PackageState::Removing,
PackageState::CreatingBackup,
PackageState::RestoringBackup,
PackageState::BackingUp,
] {
assert!(is_transitional(&s), "{:?} should be transitional", s);
}
for s in [
PackageState::Installed,
PackageState::Stopped,
PackageState::Exited,
PackageState::Running,
] {
assert!(!is_transitional(&s), "{:?} should NOT be transitional", s);
}
}
}
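
The escape-hatch rule in `scan_and_update_packages` can be looked at in isolation. The following is a minimal sketch, not the code from this change: `scan_may_override` and `STUCK_TIMEOUT` are hypothetical names, the package state is reduced to a single `transitional` flag, and `now` is passed in so the timing can be exercised without sleeping. It also simplifies the settled case, where the real loop additionally checks whether the entry differs from the scan.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for TRANSITIONAL_STUCK_TIMEOUT (2 x 600s).
const STUCK_TIMEOUT: Duration = Duration::from_secs(1200);

/// Decide whether the scanner may overwrite the stored entry for `id`.
///
/// `since` is the side-table that records when each id was first seen in
/// a transitional state. `transitional` says whether the stored entry is
/// currently transitional.
fn scan_may_override(
    since: &mut HashMap<String, Instant>,
    id: &str,
    transitional: bool,
    now: Instant,
) -> bool {
    if !transitional {
        // Settled state: drop any stale timer and let the scan win.
        since.remove(id);
        return true;
    }
    // First sighting starts the clock; later sightings reuse it.
    let entered = *since.entry(id.to_string()).or_insert(now);
    if now.duration_since(entered) > STUCK_TIMEOUT {
        // Assume the owning task died; the scan takes over.
        since.remove(id);
        true
    } else {
        false
    }
}
```

Because the timer only starts on the first scan that observes the transitional state, the effective timeout can run up to one scan interval longer than the constant suggests.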

@@ -70,6 +70,17 @@ impl SessionStore {
}
}
/// Construct an empty SessionStore that persists to a caller-supplied
/// path. Used by tests so they don't pick up sessions from the dev
/// machine's real /var/lib/archipelago/sessions.json.
#[cfg(test)]
pub fn new_for_tests(persist_path: PathBuf) -> Self {
Self {
sessions: Arc::new(RwLock::new(HashMap::new())),
persist_path,
}
}
/// Load persisted sessions from disk (only Full sessions).
async fn load_from_disk(path: &Path) -> HashMap<[u8; 32], Session> {
let mut map = HashMap::new();
@@ -462,7 +473,10 @@ mod tests {
#[tokio::test]
async fn test_session_create_and_validate() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token = store.create().await;
assert!(store.validate(&token).await);
@@ -470,13 +484,19 @@ mod tests {
#[tokio::test]
async fn test_session_invalid_token() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
assert!(!store.validate("nonexistent_token").await);
}
#[tokio::test]
async fn test_session_remove() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token = store.create().await;
assert!(store.validate(&token).await);
@@ -486,7 +506,10 @@ mod tests {
#[tokio::test]
async fn test_pending_session_upgrade() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let secret = vec![1, 2, 3, 4];
let token = store.create_pending(secret.clone()).await;
@@ -510,7 +533,10 @@ mod tests {
#[tokio::test]
async fn test_pending_session_max_attempts() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let secret = vec![1, 2, 3];
let token = store.create_pending(secret).await;
@@ -538,7 +564,10 @@ mod tests {
#[tokio::test]
async fn test_session_activity_updates_on_validate() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token = store.create().await;
// First validation should succeed and touch last_activity
@@ -550,7 +579,10 @@ mod tests {
#[tokio::test]
async fn test_invalidate_all_except() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token1 = store.create().await;
let token2 = store.create().await;
let token3 = store.create().await;
@@ -565,7 +597,10 @@ mod tests {
#[tokio::test]
async fn test_session_rotate() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let old_token = store.create().await;
assert!(store.validate(&old_token).await);
@@ -580,7 +615,10 @@ mod tests {
#[tokio::test]
async fn test_max_concurrent_sessions() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let mut tokens = Vec::new();
// Create MAX_CONCURRENT_SESSIONS sessions
@@ -608,7 +646,10 @@ mod tests {
#[tokio::test]
async fn test_active_session_count() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
assert_eq!(store.active_session_count().await, 0);
let token1 = store.create().await;
@@ -623,7 +664,10 @@ mod tests {
#[tokio::test]
async fn test_cleanup_expired_removes_stale() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token = store.create().await;
assert!(store.validate(&token).await);
@@ -636,7 +680,10 @@ mod tests {
#[tokio::test]
async fn test_rotate_preserves_session_count() {
let store = SessionStore::new().await;
let store = SessionStore::new_for_tests(std::env::temp_dir().join(format!(
"archipelago-sessions-test-{}.json",
rand::random::<u64>()
)));
let token = store.create().await;
assert_eq!(store.active_session_count().await, 1);

@@ -141,11 +141,7 @@ pub async fn snapshot() -> TransportPreferences {
/// Update a single service preference, persist to disk, and update the
/// handle. Callers must pass `data_dir` because the on-disk file lives
/// under it — the handle alone doesn't know where to write.
pub async fn set(
data_dir: &Path,
service: PeerService,
pref: TransportPref,
) -> Result<()> {
pub async fn set(data_dir: &Path, service: PeerService, pref: TransportPref) -> Result<()> {
let new_prefs = {
let lock = HANDLE.get_or_init(|| RwLock::new(TransportPreferences::default()));
let mut w = lock.write().await;
@@ -173,8 +169,7 @@ async fn save_to_disk(data_dir: &Path, prefs: &TransportPreferences) -> Result<(
.await
.with_context(|| format!("create {}", parent.display()))?;
}
let body = serde_json::to_string_pretty(prefs)
.context("serialize TransportPreferences")?;
let body = serde_json::to_string_pretty(prefs).context("serialize TransportPreferences")?;
tokio::fs::write(&path, body)
.await
.with_context(|| format!("write {}", path.display()))?;
@@ -213,10 +208,7 @@ mod tests {
p.set_for_service(PeerService::Messaging, TransportPref::Tor);
let s = serde_json::to_string(&p).unwrap();
let back: TransportPreferences = serde_json::from_str(&s).unwrap();
assert_eq!(
back.for_service(PeerService::Messaging),
TransportPref::Tor
);
assert_eq!(back.for_service(PeerService::Messaging), TransportPref::Tor);
}
#[test]

@@ -98,7 +98,12 @@ pub fn encode_chunked(data: &[u8]) -> Result<Vec<Chunk>> {
}
let shard_size = MAX_CHUNK_PAYLOAD;
let data_shard_count = data.len().div_ceil(shard_size);
// Reserve the first 4 bytes of shard 0 for a length header so the
// receiver can trim padding after FEC reconstruction. Effective
// payload capacity is therefore (shards * shard_size) - 4.
const LEN_HEADER: usize = 4;
let total_payload = data.len() + LEN_HEADER;
let data_shard_count = total_payload.div_ceil(shard_size);
if data_shard_count > MAX_PRACTICAL_CHUNKS {
anyhow::bail!(
@@ -116,22 +121,25 @@ pub fn encode_chunked(data: &[u8]) -> Result<Vec<Chunk>> {
anyhow::bail!("Too many shards: {}", total_shards);
}
// Split data into equal-size shards
// Build a single contiguous buffer: [len_u32_le][data...][zero_padding]
// then split into equal-size shards.
let buffer_size = data_shard_count * shard_size;
let mut buffer = vec![0u8; buffer_size];
buffer[..LEN_HEADER].copy_from_slice(&(data.len() as u32).to_le_bytes());
buffer[LEN_HEADER..LEN_HEADER + data.len()].copy_from_slice(data);
let mut shards: Vec<Vec<u8>> = Vec::with_capacity(total_shards);
for i in 0..data_shard_count {
let start = i * shard_size;
let end = (start + shard_size).min(data.len());
let mut shard = vec![0u8; shard_size];
shard[..end - start].copy_from_slice(&data[start..end]);
shards.push(shard);
shards.push(buffer[start..start + shard_size].to_vec());
}
// Add empty parity shards
// Empty parity shards
for _ in 0..parity_shard_count {
shards.push(vec![0u8; shard_size]);
}
// Generate parity
// Generate parity over the data shards (which now correctly include
// the length header in shard 0).
let rs = ReedSolomon::new(data_shard_count, parity_shard_count)
.context("Failed to create Reed-Solomon codec")?;
rs.encode(&mut shards)
@@ -152,18 +160,6 @@ pub fn encode_chunked(data: &[u8]) -> Result<Vec<Chunk>> {
});
}
// Encode the original data length in the first chunk's first 4 bytes
// so the receiver can trim padding after reconstruction.
let data_len = data.len() as u32;
chunks[0].payload[..4].copy_from_slice(&data_len.to_le_bytes());
// Re-encode FEC to reflect the length header change
let mut shard_data: Vec<Vec<u8>> = chunks.iter().map(|c| c.payload.clone()).collect();
rs.encode(&mut shard_data)
.context("Reed-Solomon re-encoding failed")?;
for (i, shard) in shard_data.into_iter().enumerate() {
chunks[i].payload = shard;
}
Ok(chunks)
}
@@ -318,17 +314,13 @@ mod tests {
#[test]
fn test_chunk_roundtrip_medium() {
// ~500 bytes: 4 data chunks + 1 parity
// 500 bytes payload + 4-byte length header = 504 bytes.
// ceil(504 / 124) = 5 data shards, plus ceil(5/4) = 2 parity = 7 total.
let data: Vec<u8> = (0..500).map(|i| (i % 256) as u8).collect();
let chunks = encode_chunked(&data).unwrap();
let data_chunks: Vec<_> = chunks.iter().filter(|c| !c.is_parity).collect();
let _parity_chunks: Vec<_> = chunks.iter().filter(|c| c.is_parity).collect();
assert_eq!(data_chunks.len(), 4); // ceil(500/124) = 5... wait
// Actually: ceil(500/124) = ceil(4.03) = 5 data shards
// But the first shard has 4 bytes of length header embedded, so
// the actual data capacity is 124 * N - 0 (length is IN the shard data).
// Let's just check it roundtrips.
assert_eq!(data_chunks.len(), 5);
let mut reassembler = ChunkReassembler::new();
let mut result = None;
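
The framing that the fix relies on (length header first, then data, then zero padding, split into equal shards) can be sketched without the Reed-Solomon step. This is an illustration under assumptions: `frame_into_shards` and `unframe` are hypothetical helpers, `shard_size` is a parameter rather than `MAX_CHUNK_PAYLOAD`, and parity generation is left out entirely.

```rust
const LEN_HEADER: usize = 4;

/// Lay out `[len_u32_le][data][zero padding]` and split it into
/// equal-size shards. Assumes `shard_size > 0` and that `data.len()`
/// fits in a u32.
fn frame_into_shards(data: &[u8], shard_size: usize) -> Vec<Vec<u8>> {
    let total = data.len() + LEN_HEADER;
    let shard_count = total.div_ceil(shard_size);
    let mut buffer = vec![0u8; shard_count * shard_size];
    buffer[..LEN_HEADER].copy_from_slice(&(data.len() as u32).to_le_bytes());
    buffer[LEN_HEADER..LEN_HEADER + data.len()].copy_from_slice(data);
    buffer.chunks(shard_size).map(|c| c.to_vec()).collect()
}

/// Inverse: concatenate the data shards, read the header, trim padding.
/// Returns None if the shards are too short for the claimed length.
fn unframe(shards: &[Vec<u8>]) -> Option<Vec<u8>> {
    let joined: Vec<u8> = shards.concat();
    let mut header = [0u8; LEN_HEADER];
    header.copy_from_slice(joined.get(..LEN_HEADER)?);
    let len = u32::from_le_bytes(header) as usize;
    joined.get(LEN_HEADER..LEN_HEADER + len).map(|s| s.to_vec())
}
```

With a 124-byte shard, a 500-byte payload occupies 504 bytes including the header and therefore needs 5 data shards, which is the count the updated test asserts. Writing the header before parity is computed is what removes the need for the old second encoding pass.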

@@ -55,23 +55,21 @@ fn parse_version_triple(v: &str) -> Option<(u32, u32, u32)> {
/// latest). Falls back to string inequality if either version doesn't
/// parse, preserving the old behaviour for unusual version strings.
fn is_newer(candidate: &str, current: &str) -> bool {
match (parse_version_triple(candidate), parse_version_triple(current)) {
match (
parse_version_triple(candidate),
parse_version_triple(current),
) {
(Some(a), Some(b)) => a > b,
_ => candidate != current,
}
}
const DEFAULT_UPDATE_MANIFEST_URL: &str =
"https://git.tx1138.com/lfg2025/archy/raw/branch/main/releases/manifest.json";
/// Secondary mirror: same manifest, served from the VPS. Added as a
/// default mirror so nodes automatically fall through when the primary
/// is slow or unreachable.
const DEFAULT_SECONDARY_MIRROR_URL: &str =
"http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/manifest.json";
/// Tertiary mirror on a separate OVH VPS — independent network path so
/// a single-provider outage doesn't knock out all three mirrors.
const DEFAULT_TERTIARY_MIRROR_URL: &str =
"http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/releases/manifest.json";
/// Secondary mirror on tx1138 gitea — independent network path so a
/// single-provider outage doesn't knock out both mirrors.
const DEFAULT_SECONDARY_MIRROR_URL: &str =
"https://git.tx1138.com/lfg2025/archy/raw/branch/main/releases/manifest.json";
const UPDATE_STATE_FILE: &str = "update_state.json";
const UPDATE_MIRRORS_FILE: &str = "update-mirrors.json";
/// Marker written by apply_update() just before the service restart and
@@ -110,17 +108,13 @@ fn mirrors_path(data_dir: &Path) -> std::path::PathBuf {
fn default_mirrors() -> Vec<UpdateMirror> {
vec![
UpdateMirror {
url: DEFAULT_SECONDARY_MIRROR_URL.to_string(),
label: "Server 1 (VPS)".to_string(),
},
UpdateMirror {
url: DEFAULT_UPDATE_MANIFEST_URL.to_string(),
label: "Server 2 (tx1138)".to_string(),
label: "Server 1 (OVH)".to_string(),
},
UpdateMirror {
url: DEFAULT_TERTIARY_MIRROR_URL.to_string(),
label: "Server 3 (OVH)".to_string(),
url: DEFAULT_SECONDARY_MIRROR_URL.to_string(),
label: "Server 2 (tx1138)".to_string(),
},
]
}
@@ -150,18 +144,26 @@ pub async fn load_mirrors(data_dir: &Path) -> Result<Vec<UpdateMirror>> {
return Ok(default_mirrors());
}
// One-time migration: the Hetzner VPS at 23.182.128.160 was
// decommissioned 2026-04-23. Existing nodes have it baked into their
// saved mirror list (was the original Server 1). Strip it on load so
// we don't spend seconds per install timing out against a dead host.
// Exception to the usual "explicit removals stick" rule: the user
// never chose to add this — it was a default.
let before = list.len();
list.retain(|m| !m.url.contains("23.182.128.160"));
let mut changed = list.len() != before;
// Merge in any default URLs the saved config is missing.
let known: std::collections::HashSet<String> =
list.iter().map(|m| m.url.clone()).collect();
let known: std::collections::HashSet<String> = list.iter().map(|m| m.url.clone()).collect();
let defaults = default_mirrors();
let mut added = false;
for def in &defaults {
if !known.contains(&def.url) {
list.push(def.clone());
added = true;
changed = true;
}
}
if added {
if changed {
let _ = save_mirrors(data_dir, &list).await;
}
Ok(list)
@@ -188,7 +190,8 @@ pub async fn save_mirrors(data_dir: &Path, mirrors: &[UpdateMirror]) -> Result<(
/// mirror points component downloads back at the same mirror rather
/// than whatever absolute URL the publisher baked in.
fn manifest_origin(manifest_url: &str) -> Option<String> {
let rest = manifest_url.strip_prefix("https://")
let rest = manifest_url
.strip_prefix("https://")
.map(|r| ("https", r))
.or_else(|| manifest_url.strip_prefix("http://").map(|r| ("http", r)))?;
let (scheme, after_scheme) = rest;
@@ -304,13 +307,9 @@ pub struct PendingVerification {
pub deadline_ts: i64,
}
async fn write_pending_verification(
data_dir: &Path,
marker: &PendingVerification,
) -> Result<()> {
async fn write_pending_verification(data_dir: &Path, marker: &PendingVerification) -> Result<()> {
let path = data_dir.join(PENDING_VERIFY_FILE);
let data = serde_json::to_string_pretty(marker)
.context("serialize pending-verify marker")?;
let data = serde_json::to_string_pretty(marker).context("serialize pending-verify marker")?;
fs::write(&path, data)
.await
.with_context(|| format!("write pending-verify marker to {}", path.display()))?;
@@ -404,10 +403,7 @@ pub async fn verify_pending_update(data_dir: &Path) {
attempt += 1;
match probe_frontend_once().await {
Ok(()) => {
info!(
attempt,
"Post-OTA verification succeeded — clearing marker"
);
info!(attempt, "Post-OTA verification succeeded — clearing marker");
clear_pending_verification(data_dir).await;
return;
}
@@ -441,9 +437,7 @@ pub async fn verify_pending_update(data_dir: &Path) {
let _ = host_sudo(&["mv", web_ui_bak.to_str().unwrap_or(""), web_ui]).await;
tracing::info!(quarantined = %quarantine, "Restored web-ui from web-ui.bak");
} else {
tracing::warn!(
"web-ui.bak not present — frontend cannot be rolled back, only binary"
);
tracing::warn!("web-ui.bak not present — frontend cannot be rolled back, only binary");
}
if let Err(e) = rollback_update(data_dir).await {
@@ -478,8 +472,7 @@ pub async fn load_state(data_dir: &Path) -> Result<UpdateState> {
let data = fs::read_to_string(&path)
.await
.context("Reading update state")?;
let mut state: UpdateState =
serde_json::from_str(&data).context("Parsing update state")?;
let mut state: UpdateState = serde_json::from_str(&data).context("Parsing update state")?;
// Keep current_version in sync with the binary. Sideloaded nodes
// (ssh + cp /usr/local/bin/archipelago) don't touch the state file,
@@ -553,7 +546,8 @@ pub async fn check_for_updates(data_dir: &Path) -> Result<UpdateState> {
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
match client.get(manifest_url).send().await {
Ok(resp) if resp.status().is_success() => match resp.json::<UpdateManifest>().await {
Ok(resp) if resp.status().is_success() => match resp.json::<UpdateManifest>().await
{
Ok(mut manifest) => {
rewrite_manifest_origins(&mut manifest, manifest_url);
if is_newer(&manifest.version, &state.current_version) {
@@ -1093,26 +1087,15 @@ pub async fn apply_update(data_dir: &Path) -> Result<()> {
if !mk.success() {
anyhow::bail!("mkdir {} failed", staging_new);
}
let extract = host_sudo(&[
"tar",
"-xzf",
&src.to_string_lossy(),
"-C",
&staging_new,
])
.await
.with_context(|| format!("Failed to extract {}", name))?;
let extract =
host_sudo(&["tar", "-xzf", &src.to_string_lossy(), "-C", &staging_new])
.await
.with_context(|| format!("Failed to extract {}", name))?;
if !extract.success() {
let _ = host_sudo(&["rm", "-rf", &staging_new]).await;
anyhow::bail!("tar extraction failed for {}", name);
}
let _ = host_sudo(&[
"chown",
"-R",
"archipelago:archipelago",
&staging_new,
])
.await;
let _ = host_sudo(&["chown", "-R", "archipelago:archipelago", &staging_new]).await;
// Set world-readable perms so nginx (runs as www-data)
// can stat + serve the files. Without this, the tar
@@ -1121,11 +1104,27 @@ pub async fn apply_update(data_dir: &Path) -> Result<()> {
// swap — exactly what bit .116 on the v1.7.38 rollout.
let _ = host_sudo(&["chmod", "755", &staging_new]).await;
let _ = host_sudo(&[
"find", &staging_new, "-type", "d", "-exec", "chmod", "755", "{}", "+",
"find",
&staging_new,
"-type",
"d",
"-exec",
"chmod",
"755",
"{}",
"+",
])
.await;
let _ = host_sudo(&[
"find", &staging_new, "-type", "f", "-exec", "chmod", "644", "{}", "+",
"find",
&staging_new,
"-type",
"f",
"-exec",
"chmod",
"644",
"{}",
"+",
])
.await;
@@ -1167,12 +1166,8 @@ pub async fn apply_update(data_dir: &Path) -> Result<()> {
// old copy as the new rollback.
if Path::new(&staging_old).exists() {
if Path::new(backup_path).exists() {
let _ = host_sudo(&[
"mv",
backup_path,
&format!("{}.{}", backup_path, ts),
])
.await;
let _ = host_sudo(&["mv", backup_path, &format!("{}.{}", backup_path, ts)])
.await;
}
let _ = host_sudo(&["mv", &staging_old, backup_path]).await;
}
@@ -1211,9 +1206,7 @@ pub async fn apply_update(data_dir: &Path) -> Result<()> {
applied_at: chrono::Utc::now().to_rfc3339(),
new_version,
previous_version,
deadline_ts: chrono::Utc::now().timestamp()
+ PENDING_VERIFY_WINDOW_SECS as i64
+ 60,
deadline_ts: chrono::Utc::now().timestamp() + PENDING_VERIFY_WINDOW_SECS as i64 + 60,
};
if let Err(e) = write_pending_verification(data_dir, &marker).await {
tracing::warn!(error = %e, "Failed to write post-OTA verify marker — rollback disabled for this OTA");
@@ -1379,7 +1372,9 @@ pub async fn run_update_scheduler(data_dir: std::path::PathBuf) {
debug!("Update scheduler: apply failed: {}", e);
continue;
}
info!("Update scheduler: update applied, restart scheduled by apply_update");
info!(
"Update scheduler: update applied, restart scheduled by apply_update"
);
// apply_update has already spawned a 2s-delayed
// `systemctl restart archipelago`. Don't call
// std::process::exit here — that kills the runtime
@@ -1414,7 +1409,9 @@ mod tests {
#[test]
fn test_manifest_origin_parses_https() {
assert_eq!(
manifest_origin("https://git.tx1138.com/lfg2025/archy/raw/branch/main/releases/manifest.json"),
manifest_origin(
"https://git.tx1138.com/lfg2025/archy/raw/branch/main/releases/manifest.json"
),
Some("https://git.tx1138.com".to_string())
);
}
@@ -1422,7 +1419,9 @@ mod tests {
#[test]
fn test_manifest_origin_parses_http_with_port() {
assert_eq!(
manifest_origin("http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/manifest.json"),
manifest_origin(
"http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/manifest.json"
),
Some("http://23.182.128.160:3000".to_string())
);
}
@@ -1458,7 +1457,10 @@ mod tests {
},
],
};
rewrite_manifest_origins(&mut manifest, "http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/manifest.json");
rewrite_manifest_origins(
&mut manifest,
"http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/manifest.json",
);
assert_eq!(
manifest.components[0].download_url,
"http://23.182.128.160:3000/lfg2025/archy/raw/branch/main/releases/v1.7.26-alpha/archipelago"
@@ -1473,10 +1475,9 @@ mod tests {
async fn test_load_mirrors_returns_defaults_when_absent() {
let dir = tempfile::tempdir().unwrap();
let list = load_mirrors(dir.path()).await.unwrap();
assert_eq!(list.len(), 3);
assert!(list[0].url.contains("23.182.128.160"));
assert_eq!(list.len(), 2);
assert!(list[0].url.contains("146.59.87.168"));
assert!(list[1].url.contains("git.tx1138.com"));
assert!(list[2].url.contains("146.59.87.168"));
}
#[tokio::test]
@@ -1488,7 +1489,22 @@ mod tests {
}];
save_mirrors(dir.path(), &list).await.unwrap();
let back = load_mirrors(dir.path()).await.unwrap();
assert_eq!(back, list);
// load_mirrors merges in any missing default mirrors so a node
// that explicitly added a single custom mirror still gets the
// built-in OVH + tx1138 fallbacks. The custom mirror is preserved.
assert!(
back.iter().any(|m| m.url == "https://example.com/m.json"),
"custom mirror should round-trip; got {:?}",
back
);
for def in default_mirrors() {
assert!(
back.iter().any(|m| m.url == def.url),
"default mirror {} should be present after load; got {:?}",
def.url,
back
);
}
}
#[test]
@@ -1701,7 +1717,9 @@ mod tests {
previous_version: "1.7.40-alpha".into(),
deadline_ts: chrono::Utc::now().timestamp() + 150,
};
write_pending_verification(dir.path(), &marker).await.unwrap();
write_pending_verification(dir.path(), &marker)
.await
.unwrap();
let read = read_pending_verification(dir.path()).await.unwrap();
assert_eq!(read.new_version, "1.7.41-alpha");
assert_eq!(read.previous_version, "1.7.40-alpha");
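
The load-time migration in `load_mirrors` (strip the retired default, then merge in any missing defaults, and save only if something changed) can be sketched as a pure function. This is a simplified illustration: `Mirror` and `migrate_mirrors` are hypothetical names standing in for `UpdateMirror` and the inline logic, and disk I/O is left to the caller.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq)]
struct Mirror {
    url: String,
    label: String,
}

/// Drop mirrors whose URL contains `retired_host`, then append any
/// default whose URL is not already present. Returns true when the list
/// changed and should be written back to disk.
fn migrate_mirrors(list: &mut Vec<Mirror>, defaults: &[Mirror], retired_host: &str) -> bool {
    let before = list.len();
    list.retain(|m| !m.url.contains(retired_host));
    let mut changed = list.len() != before;

    // Snapshot of the URLs that survived the strip step.
    let known: HashSet<String> = list.iter().map(|m| m.url.clone()).collect();
    for def in defaults {
        if !known.contains(&def.url) {
            list.push(def.clone());
            changed = true;
        }
    }
    changed
}
```

Tracking a single `changed` flag, rather than only `added`, is what makes a strip-only migration get persisted as well.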

@@ -334,7 +334,9 @@ mod tests {
amount: 1,
id: "test".into(),
secret: "s".into(),
c: "02a9acc1e48c25eeeb9289b5031cc57da9fe72f3fe2861d94ec4da0e7f6c2b4e24".to_string(),
// Generator point G of secp256k1, compressed form. Always a
// valid pubkey, so c_as_pubkey() must succeed.
c: "0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798".to_string(),
};
assert!(proof.c_as_pubkey().is_ok());
}

@@ -213,9 +213,16 @@ mod tests {
version: "1.0.0".to_string(),
description: None,
container: ContainerConfig {
image: format!("test/{}:latest", id),
image: Some(format!("test/{}:latest", id)),
image_signature: None,
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
custom_args: vec![],
entrypoint: None,
derived_env: vec![],
secret_env: vec![],
data_uid: None,
},
dependencies: deps,
resources: Default::default(),

@@ -9,7 +9,11 @@ pub mod runtime;
pub use bitcoin_simulator::{BitcoinSimulationMode, BitcoinSimulator};
pub use dependency_resolver::DependencyResolver;
pub use health_monitor::HealthMonitor;
pub use manifest::{AppManifest, Dependency, HealthCheck, ResourceLimits, SecurityPolicy};
pub use manifest::{
AppManifest, BuildConfig, ContainerConfig, Dependency, DerivedEnv, HealthCheck, HostFacts,
ManifestError, ResolvedSource, ResourceLimits, SecretEnv, SecretsProvider, SecurityPolicy,
Volume,
};
pub use podman_client::{ContainerState, ContainerStatus, PodmanClient};
pub use port_manager::{PortError, PortManager};
pub use runtime::{AutoRuntime, ContainerRuntime, DockerRuntime, PodmanRuntime};

@@ -57,17 +57,136 @@ pub struct AppDefinition {
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct ContainerConfig {
pub image: String,
/// Pull source. Mutually exclusive with `build`. Exactly one of the two must be present.
#[serde(default)]
pub image: Option<String>,
#[serde(default)]
pub image_signature: Option<String>,
#[serde(default = "default_pull_policy")]
pub pull_policy: String,
/// Local build source. Mutually exclusive with `image`.
#[serde(default)]
pub build: Option<BuildConfig>,
// ── Step 8b.0 additions ──────────────────────────────────────────
//
// Fields the Rust orchestrator needs to faithfully port containers
// from the legacy `scripts/container-specs.sh` registry. See
// `docs/STEP-8B-PORT-AUDIT.md` for the full justification per field.
//
// All are optional with `#[serde(default)]` so every existing manifest
// in `apps/` continues to parse unchanged.
/// Podman `--network` value. `Some("archy-net")` joins the shared
/// Archipelago bridge. `Some("host")` uses host networking.
/// `None` (the default) falls back to podman's default isolated
/// network — equivalent to today's rootless default.
///
/// `SecurityPolicy::network_policy` remains a policy knob (what the
/// firewall layer does); this field is literally the CLI flag value.
#[serde(default)]
pub network: Option<String>,
/// Extra positional arguments appended to the container command
/// after the image. Mirrors `SPEC_CUSTOM_ARGS` in
/// `scripts/container-specs.sh` (bitcoin-knots prune/dbcache flags,
/// filebrowser `--config /data/.filebrowser.json`, etc).
#[serde(default)]
pub custom_args: Vec<String>,
/// Entrypoint override (`podman run --entrypoint …`). When present,
/// replaces the image's default entrypoint. Mirrors `SPEC_ENTRYPOINT`
/// for fedimint-gateway's LND-aware invocation.
#[serde(default)]
pub entrypoint: Option<Vec<String>>,
/// Environment keys whose values are rendered from a small
/// allow-list of host facts (`HOST_IP`, `HOST_MDNS`, `DISK_GB`).
/// Resolved by `ContainerConfig::resolve_derived_env` at apply time
/// — never hard-coded into the manifest.
///
/// Example: `- { key: FM_P2P_URL, template: "fedimint://{{HOST_MDNS}}:8173" }`
#[serde(default)]
pub derived_env: Vec<DerivedEnv>,
/// Environment keys whose values are read from files in
/// `/var/lib/archipelago/secrets/<secret_file>`. Never logged.
/// Resolved by `ContainerConfig::resolve_secret_env` at apply time.
///
/// Example: `- { key: FM_BITCOIND_PASSWORD, secret_file: bitcoin-rpc-password }`
#[serde(default)]
pub secret_env: Vec<SecretEnv>,
/// Rootless-mapped UID:GID applied to the container's data directory
/// (the `bind`-mounted host path with `target` inside the container's
/// data root) before creation. Mirrors `SPEC_DATA_UID`.
///
/// Example: `"100070:100070"` for Postgres' mapped subuid.
#[serde(default)]
pub data_uid: Option<String>,
}
/// Derived-env entry. The template is rendered against `HostFacts` at
/// apply time; only `{{PLACEHOLDER}}` names from the supported list
/// (`HOST_IP`, `HOST_MDNS`, `DISK_GB`) are allowed.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct DerivedEnv {
pub key: String,
pub template: String,
}
/// Secret-env entry. `secret_file` is resolved against a
/// `SecretsProvider` (in prod, `/var/lib/archipelago/secrets/`).
///
/// `secret_file` is restricted to a bare filename — no `/`, no `..`.
/// Validated at `AppManifest::validate` time.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct SecretEnv {
pub key: String,
pub secret_file: String,
}
fn default_pull_policy() -> String {
"if-not-present".to_string()
}
/// Build a container image locally from a Dockerfile rather than pulling from a registry.
///
/// When present on `ContainerConfig`, the orchestrator runs `podman build -t <tag> -f <dockerfile> <context>`
/// before starting the container. The resulting local image is referenced by `tag`.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct BuildConfig {
/// Build context directory (absolute path or relative to the manifest location).
pub context: String,
/// Dockerfile path relative to `context`. Defaults to `Dockerfile`.
#[serde(default = "default_dockerfile")]
pub dockerfile: String,
/// Tag applied to the built image. Used as the container's image reference.
pub tag: String,
/// Optional `--build-arg KEY=VALUE` pairs passed to the build.
#[serde(default)]
pub build_args: HashMap<String, String>,
}
fn default_dockerfile() -> String {
"Dockerfile".to_string()
}
/// Resolved pull-or-build decision after manifest validation.
///
/// `ContainerConfig::resolve()` produces this. The orchestrator matches on it
/// to decide whether to pull a registry image or invoke a local build.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResolvedSource {
/// Pull `image` from a registry using `pull_policy` semantics.
Pull {
image: String,
pull_policy: String,
image_signature: Option<String>,
},
/// Build locally. The resulting tag is the image reference for `podman create`.
Build(BuildConfig),
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(untagged)]
pub enum Dependency {
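Review note: the doc comment on `ResolvedSource` says the orchestrator matches on it to decide between a registry pull and a local build. A minimal sketch of what that match might look like follows. The types are simplified stand-ins, `PrepareStep` and `plan_prepare` are hypothetical names, and the reading of `pull_policy` (`"always"` re-pulls, anything else pulls only when the image is missing) is an assumption, not something this diff confirms.

```rust
// Simplified stand-ins for the manifest types in this diff; only the fields
// needed for the illustration are kept.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct BuildConfig {
    context: String,
    dockerfile: String,
    tag: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum ResolvedSource {
    Pull { image: String, pull_policy: String },
    Build(BuildConfig),
}

/// Hypothetical action type: what a reconciler might do before creating
/// the container.
#[derive(Debug, PartialEq, Eq)]
enum PrepareStep {
    PullImage(String),
    BuildImage(String),
    Nothing,
}

/// Hypothetical planner. `present_locally` stands in for the result of an
/// `image_exists`-style local-storage check.
fn plan_prepare(src: &ResolvedSource, present_locally: bool) -> PrepareStep {
    match src {
        ResolvedSource::Pull { image, pull_policy } => {
            // Assumed semantics: "always" re-pulls even when the image is cached.
            if pull_policy.as_str() == "always" || !present_locally {
                PrepareStep::PullImage(image.clone())
            } else {
                PrepareStep::Nothing
            }
        }
        ResolvedSource::Build(b) => {
            // For a build source, the tag doubles as the image reference.
            if present_locally {
                PrepareStep::Nothing
            } else {
                PrepareStep::BuildImage(b.tag.clone())
            }
        }
    }
}
```

Keeping the decision in a pure function like this would let the pull-versus-build logic be unit-tested without a container runtime.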
@@ -133,10 +252,15 @@ impl From<(u16, u16)> for PortMapping {
pub struct Volume {
#[serde(rename = "type")]
pub volume_type: String,
#[serde(default)]
pub source: String,
pub target: String,
#[serde(default)]
pub options: Vec<String>,
/// For `type: tmpfs` only. Comma-separated mount options
/// (e.g. `"rw,noexec,nosuid,size=256m"`). Ignored for bind/volume.
#[serde(default)]
pub tmpfs_options: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -182,10 +306,33 @@ impl AppManifest {
return Err(ManifestError::Invalid("app.id cannot be empty".to_string()));
}
if self.app.container.image.is_empty() {
return Err(ManifestError::Invalid(
"container.image cannot be empty".to_string(),
));
// Exactly one of container.image or container.build must be set. We can't
// default either side, because an empty-string image or an empty build block
// would be silently wrong downstream.
match (&self.app.container.image, &self.app.container.build) {
(Some(img), None) if !img.is_empty() => {}
(None, Some(b)) => {
if b.context.is_empty() {
return Err(ManifestError::Invalid(
"container.build.context cannot be empty".to_string(),
));
}
if b.tag.is_empty() {
return Err(ManifestError::Invalid(
"container.build.tag cannot be empty".to_string(),
));
}
}
(Some(_), Some(_)) => {
return Err(ManifestError::Invalid(
"container.image and container.build are mutually exclusive".to_string(),
));
}
_ => {
return Err(ManifestError::Invalid(
"container must specify either image or build".to_string(),
));
}
}
// Validate version format (semantic versioning)
@@ -195,13 +342,295 @@ impl AppManifest {
));
}
// ── Step 8b.0 field validation ────────────────────────────────
// network: allow any non-empty string; podman itself is the
// final authority (named networks, "host", "bridge", "none",
// "container:<name>", etc). Reject only the empty-string case
// so "network:" with no value is a loud error instead of a
// silent default.
if let Some(n) = &self.app.container.network {
if n.is_empty() {
return Err(ManifestError::Invalid(
"container.network cannot be empty (omit the field to use default)".to_string(),
));
}
}
// custom_args: no empty strings (would inject literal "" into
// the podman command line and confuse downstream parsing).
for (i, a) in self.app.container.custom_args.iter().enumerate() {
if a.is_empty() {
return Err(ManifestError::Invalid(format!(
"container.custom_args[{i}] cannot be empty"
)));
}
}
// entrypoint: present ⇒ non-empty vec, no empty elements.
if let Some(ep) = &self.app.container.entrypoint {
if ep.is_empty() {
return Err(ManifestError::Invalid(
"container.entrypoint must contain at least one element when set".to_string(),
));
}
for (i, a) in ep.iter().enumerate() {
if a.is_empty() {
return Err(ManifestError::Invalid(format!(
"container.entrypoint[{i}] cannot be empty"
)));
}
}
}
// derived_env: non-empty keys, unique keys, templates reference
// only known host-fact placeholders.
{
let mut seen: std::collections::HashSet<&str> = std::collections::HashSet::new();
for (i, e) in self.app.container.derived_env.iter().enumerate() {
if e.key.is_empty() {
return Err(ManifestError::Invalid(format!(
"container.derived_env[{i}].key cannot be empty"
)));
}
if !seen.insert(e.key.as_str()) {
return Err(ManifestError::Invalid(format!(
"container.derived_env has duplicate key '{}'",
e.key
)));
}
validate_derived_template(&e.key, &e.template)?;
}
}
// secret_env: non-empty keys, unique keys, secret_file is a
// bare filename (no '/', no '..').
{
let mut seen: std::collections::HashSet<&str> = std::collections::HashSet::new();
for (i, e) in self.app.container.secret_env.iter().enumerate() {
if e.key.is_empty() {
return Err(ManifestError::Invalid(format!(
"container.secret_env[{i}].key cannot be empty"
)));
}
if !seen.insert(e.key.as_str()) {
return Err(ManifestError::Invalid(format!(
"container.secret_env has duplicate key '{}'",
e.key
)));
}
if e.secret_file.is_empty()
|| e.secret_file.contains('/')
|| e.secret_file.contains("..")
{
return Err(ManifestError::Invalid(format!(
"container.secret_env[{}].secret_file must be a bare filename (no '/', no '..'), got '{}'",
i, e.secret_file
)));
}
}
}
// data_uid: if set, must look like "NNNNN:NNNNN".
if let Some(u) = &self.app.container.data_uid {
let parts: Vec<&str> = u.split(':').collect();
let valid = parts.len() == 2
&& !parts[0].is_empty()
&& !parts[1].is_empty()
&& parts[0].chars().all(|c| c.is_ascii_digit())
&& parts[1].chars().all(|c| c.is_ascii_digit());
if !valid {
return Err(ManifestError::Invalid(format!(
"container.data_uid must be 'UID:GID' with numeric parts, got '{}'",
u
)));
}
}
// Volume tmpfs_options: only meaningful for type: tmpfs.
for (i, v) in self.app.volumes.iter().enumerate() {
if v.volume_type == "tmpfs" {
if v.target.is_empty() {
return Err(ManifestError::Invalid(format!(
"volumes[{i}] (tmpfs) must set target"
)));
}
if !v.source.is_empty() {
return Err(ManifestError::Invalid(format!(
"volumes[{i}] (tmpfs) must not set source"
)));
}
} else if v.tmpfs_options.is_some() {
return Err(ManifestError::Invalid(format!(
"volumes[{i}] sets tmpfs_options but type is '{}', not 'tmpfs'",
v.volume_type
)));
} else {
if v.source.is_empty() {
return Err(ManifestError::Invalid(format!(
"volumes[{i}] ({}) must set source",
v.volume_type
)));
}
if v.target.is_empty() {
return Err(ManifestError::Invalid(format!(
"volumes[{i}] ({}) must set target",
v.volume_type
)));
}
}
}
Ok(())
}
}
/// Host facts available to `derived_env` templates at apply time.
///
/// Mirrors the values `scripts/container-specs.sh:detect_environment()`
/// computed before each reconcile pass. The Rust orchestrator computes
/// these once per reconcile tick and passes them to
/// `ContainerConfig::resolve_derived_env`.
#[derive(Debug, Clone)]
pub struct HostFacts {
/// Primary host IPv4 (e.g. from `hostname -I | awk '{print $1}'`).
/// Falls back to `127.0.0.1` on detection failure.
pub host_ip: String,
/// mDNS hostname (`<hostname>.local`). Survives DHCP churn and
/// reinstall-on-different-IP. Requires avahi-daemon on the node.
pub host_mdns: String,
/// Usable disk size in gigabytes at `/var/lib/archipelago` (or
/// `/` if the data partition is not yet mounted). Drives the
/// prune-vs-full-node decision in bitcoin-knots custom_args.
pub disk_gb: u64,
}
impl HostFacts {
/// Test-only constant fixture; do not use in production paths.
#[cfg(test)]
pub fn sample() -> Self {
Self {
host_ip: "192.168.1.116".to_string(),
host_mdns: "archi-thinkpad.local".to_string(),
disk_gb: 2000,
}
}
}
/// Supported placeholder names in `DerivedEnv::template`. Keep in sync
/// with `HostFacts`. Centralized so validation and rendering agree.
const DERIVED_PLACEHOLDERS: &[&str] = &["HOST_IP", "HOST_MDNS", "DISK_GB"];
fn validate_derived_template(key: &str, template: &str) -> Result<(), ManifestError> {
// Walk `{{NAME}}` occurrences and ensure each NAME is recognized.
// An opening `{{` with no closing `}}` is a user error; a stray `}}`
// is left as literal text.
let bytes = template.as_bytes();
let mut i = 0;
while i + 1 < bytes.len() {
if bytes[i] == b'{' && bytes[i + 1] == b'{' {
let rest = &template[i + 2..];
let close = rest.find("}}").ok_or_else(|| {
ManifestError::Invalid(format!(
"container.derived_env['{key}'].template has unbalanced '{{{{' — no closing '}}}}'"
))
})?;
let name = &rest[..close];
if !DERIVED_PLACEHOLDERS.contains(&name) {
return Err(ManifestError::Invalid(format!(
"container.derived_env['{key}'].template references unknown placeholder '{{{{{name}}}}}' (supported: {})",
DERIVED_PLACEHOLDERS.join(", ")
)));
}
i = i + 2 + close + 2;
} else {
i += 1;
}
}
Ok(())
}
/// A source of named secrets. In prod this is a directory on disk
/// (`/var/lib/archipelago/secrets/`); in tests, a HashMap.
pub trait SecretsProvider {
/// Read the named secret and return its value with trailing
/// whitespace trimmed (so `echo "…" > secret-file` works without
/// injecting a newline into env).
fn read(&self, name: &str) -> Result<String, ManifestError>;
}
impl ContainerConfig {
/// Collapse the (image, build) pair into a single resolved source.
///
/// Returns `None` if the config is in an invalid state (e.g. neither field set
/// or both set). Callers should have already run `AppManifest::validate()` to
/// surface a user-facing error; this method is for internal orchestrator use
/// after validation has passed.
pub fn resolve(&self) -> Option<ResolvedSource> {
match (&self.image, &self.build) {
(Some(img), None) if !img.is_empty() => Some(ResolvedSource::Pull {
image: img.clone(),
pull_policy: self.pull_policy.clone(),
image_signature: self.image_signature.clone(),
}),
(None, Some(b)) => Some(ResolvedSource::Build(b.clone())),
_ => None,
}
}
/// The image reference used to create/inspect a container for this config.
///
/// For Pull sources this is the registry image. For Build sources this is
/// the locally-built tag. Returns `None` only for an invalid config.
pub fn image_ref(&self) -> Option<String> {
self.resolve().map(|r| match r {
ResolvedSource::Pull { image, .. } => image,
ResolvedSource::Build(b) => b.tag,
})
}
/// Render every `derived_env` entry's template against the given
/// host facts. Returns `"KEY=VALUE"` strings ready to concatenate
/// with `environment:`.
///
/// Assumes `AppManifest::validate()` has already accepted the
/// manifest — placeholder names are not re-checked here.
pub fn resolve_derived_env(&self, facts: &HostFacts) -> Vec<String> {
self.derived_env
.iter()
.map(|e| {
let value = e
.template
.replace("{{HOST_IP}}", &facts.host_ip)
.replace("{{HOST_MDNS}}", &facts.host_mdns)
.replace("{{DISK_GB}}", &facts.disk_gb.to_string());
format!("{}={}", e.key, value)
})
.collect()
}
/// Read every `secret_env` entry's value from the provider and
/// return `"KEY=VALUE"` strings. Propagates the provider error on
/// the first missing/unreadable secret — partial resolution is not
/// useful because it silently produces a misconfigured container.
pub fn resolve_secret_env(
&self,
provider: &dyn SecretsProvider,
) -> Result<Vec<String>, ManifestError> {
let mut out = Vec::with_capacity(self.secret_env.len());
for e in &self.secret_env {
let v = provider.read(&e.secret_file)?;
out.push(format!("{}={}", e.key, v));
}
Ok(out)
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::collections::HashMap;
use std::fs;
use std::path::{Path, PathBuf};
#[test]
fn test_manifest_parse() {
@@ -234,4 +663,464 @@ app:
let result = AppManifest::parse(yaml);
assert!(result.is_err());
}
#[test]
fn pull_source_resolves_to_pull() {
let yaml = r#"
app:
id: test-app
name: Test
version: 1.0.0
container:
image: docker.io/library/nginx:1.27
pull_policy: always
"#;
let m = AppManifest::parse(yaml).unwrap();
let src = m.app.container.resolve().unwrap();
match src {
ResolvedSource::Pull {
image, pull_policy, ..
} => {
assert_eq!(image, "docker.io/library/nginx:1.27");
assert_eq!(pull_policy, "always");
}
_ => panic!("expected Pull"),
}
assert_eq!(
m.app.container.image_ref().as_deref(),
Some("docker.io/library/nginx:1.27")
);
}
#[test]
fn build_source_resolves_to_build() {
let yaml = r#"
app:
id: bitcoin-ui
name: Bitcoin UI
version: 1.0.0
container:
build:
context: /opt/archipelago/docker/bitcoin-ui
dockerfile: Dockerfile
tag: archy-bitcoin-ui:local
build_args:
NGINX_VERSION: "1.27"
"#;
let m = AppManifest::parse(yaml).unwrap();
let src = m.app.container.resolve().unwrap();
match src {
ResolvedSource::Build(b) => {
assert_eq!(b.context, "/opt/archipelago/docker/bitcoin-ui");
assert_eq!(b.dockerfile, "Dockerfile");
assert_eq!(b.tag, "archy-bitcoin-ui:local");
assert_eq!(b.build_args.get("NGINX_VERSION").unwrap(), "1.27");
}
_ => panic!("expected Build"),
}
assert_eq!(
m.app.container.image_ref().as_deref(),
Some("archy-bitcoin-ui:local")
);
}
#[test]
fn dockerfile_defaults_to_dockerfile() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
build:
context: /tmp
tag: x:local
"#;
let m = AppManifest::parse(yaml).unwrap();
match m.app.container.resolve().unwrap() {
ResolvedSource::Build(b) => assert_eq!(b.dockerfile, "Dockerfile"),
_ => unreachable!(),
}
}
#[test]
fn image_and_build_both_set_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
image: foo:latest
build:
context: /tmp
tag: x:local
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(
msg.contains("mutually exclusive"),
"unexpected error: {msg}"
);
}
#[test]
fn neither_image_nor_build_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container: {}
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(
msg.contains("either image or build"),
"unexpected error: {msg}"
);
}
#[test]
fn empty_image_string_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
image: ""
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(
msg.contains("either image or build"),
"unexpected error: {msg}"
);
}
#[test]
fn empty_build_context_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
build:
context: ""
tag: x:local
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("context"), "unexpected error: {msg}");
}
#[test]
fn empty_build_tag_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
build:
context: /tmp
tag: ""
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("tag"), "unexpected error: {msg}");
}
#[test]
fn existing_pull_only_manifests_still_parse() {
// Backwards-compat smoke: the shape every file in apps/*/manifest.yml uses today.
let yaml = r#"
app:
id: legacy
name: Legacy App
version: 0.1.0
description: existing shape
container:
image: registry.example.com/legacy:1.2.3
image_signature: sha256:abc
ports:
- { host: 8080, container: 80 }
"#;
let m = AppManifest::parse(yaml).unwrap();
assert_eq!(m.app.container.pull_policy, "if-not-present");
assert!(matches!(
m.app.container.resolve().unwrap(),
ResolvedSource::Pull { .. }
));
}
#[test]
fn empty_custom_arg_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
image: foo:latest
custom_args: [""]
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("custom_args[0]"), "unexpected error: {msg}");
}
#[test]
fn empty_entrypoint_vec_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
image: foo:latest
entrypoint: []
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("entrypoint"), "unexpected error: {msg}");
}
#[test]
fn empty_entrypoint_element_is_rejected() {
let yaml = r#"
app:
id: x
name: X
version: 1.0.0
container:
image: foo:latest
entrypoint: ["gatewayd", ""]
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("entrypoint[1]"), "unexpected error: {msg}");
}
#[test]
fn duplicate_derived_env_keys_are_rejected() {
let yaml = r#"
app:
id: fedimint
name: Fedimint
version: 0.10.0
container:
image: fedimintd:v0.10.0
derived_env:
- key: FM_API_URL
template: "ws://{{HOST_MDNS}}:8174"
- key: FM_API_URL
template: "ws://{{HOST_IP}}:8174"
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("duplicate key"), "unexpected error: {msg}");
}
#[test]
fn unknown_derived_placeholder_is_rejected() {
let yaml = r#"
app:
id: fedimint
name: Fedimint
version: 0.10.0
container:
image: fedimintd:v0.10.0
derived_env:
- key: FM_API_URL
template: "ws://{{HOSTNAME}}:8174"
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(
msg.contains("unknown placeholder"),
"unexpected error: {msg}"
);
}
#[test]
fn path_traversal_secret_file_is_rejected() {
let yaml = r#"
app:
id: fedimint
name: Fedimint
version: 0.10.0
container:
image: fedimintd:v0.10.0
secret_env:
- key: FM_BITCOIND_PASSWORD
secret_file: "../bitcoin-rpc-password"
"#;
let err = AppManifest::parse(yaml).unwrap_err();
let msg = format!("{err}");
assert!(msg.contains("bare filename"), "unexpected error: {msg}");
}
#[test]
fn resolve_derived_env_renders_host_facts() {
let c = ContainerConfig {
image: Some("x:latest".to_string()),
image_signature: None,
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
custom_args: vec![],
entrypoint: None,
derived_env: vec![
DerivedEnv {
key: "FM_API_URL".to_string(),
template: "ws://{{HOST_MDNS}}:8174".to_string(),
},
DerivedEnv {
key: "INFO".to_string(),
template: "{{HOST_IP}}-{{DISK_GB}}".to_string(),
},
],
secret_env: vec![],
data_uid: None,
};
let facts = HostFacts {
host_ip: "192.168.1.116".to_string(),
host_mdns: "archi-thinkpad.local".to_string(),
disk_gb: 2000,
};
let out = c.resolve_derived_env(&facts);
assert_eq!(out[0], "FM_API_URL=ws://archi-thinkpad.local:8174");
assert_eq!(out[1], "INFO=192.168.1.116-2000");
}
struct MapSecretsProvider {
data: HashMap<String, String>,
}
impl SecretsProvider for MapSecretsProvider {
fn read(&self, name: &str) -> Result<String, ManifestError> {
self.data
.get(name)
.cloned()
.ok_or_else(|| ManifestError::Invalid(format!("missing secret: {name}")))
}
}
#[test]
fn resolve_secret_env_reads_from_provider() {
let c = ContainerConfig {
image: Some("x:latest".to_string()),
image_signature: None,
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
custom_args: vec![],
entrypoint: None,
derived_env: vec![],
secret_env: vec![
SecretEnv {
key: "FM_BITCOIND_PASSWORD".to_string(),
secret_file: "bitcoin-rpc-password".to_string(),
},
SecretEnv {
key: "FM_GATEWAY_PASSWORD".to_string(),
secret_file: "fedimint-gateway-password".to_string(),
},
],
data_uid: None,
};
let p = MapSecretsProvider {
data: HashMap::from([
(
"bitcoin-rpc-password".to_string(),
"supersecret1".to_string(),
),
(
"fedimint-gateway-password".to_string(),
"supersecret2".to_string(),
),
]),
};
let out = c.resolve_secret_env(&p).unwrap();
assert_eq!(out[0], "FM_BITCOIND_PASSWORD=supersecret1");
assert_eq!(out[1], "FM_GATEWAY_PASSWORD=supersecret2");
}
#[test]
fn parse_every_real_manifest() {
let app_manifests = list_repo_manifests();
assert!(
!app_manifests.is_empty(),
"no apps/*/manifest.yml files found"
);
let mut failures: Vec<String> = Vec::new();
let mut modern_count = 0usize;
let mut legacy_count = 0usize;
for path in app_manifests {
let content = fs::read_to_string(&path).expect("read manifest");
let parsed_yaml: serde_yaml::Value = match serde_yaml::from_str(&content) {
Ok(v) => v,
Err(err) => {
failures.push(format!("{}: YAML parse error: {err}", path.display()));
continue;
}
};
let is_modern = parsed_yaml
.as_mapping()
.map(|m| m.contains_key(serde_yaml::Value::String("app".to_string())))
.unwrap_or(false);
if is_modern {
modern_count += 1;
if let Err(err) = AppManifest::parse(&content) {
failures.push(format!("{}: {err}", path.display()));
}
} else {
legacy_count += 1;
}
}
assert!(modern_count > 0, "no modern app-schema manifests found");
assert!(
legacy_count > 0,
"expected at least one legacy manifest shape"
);
assert!(
failures.is_empty(),
"manifest parse failures:\n{}",
failures.join("\n")
);
}
fn list_repo_manifests() -> Vec<PathBuf> {
let repo_root = Path::new(env!("CARGO_MANIFEST_DIR")).join("..").join("..");
let apps_dir = repo_root.join("apps");
let mut out = Vec::new();
let Ok(entries) = fs::read_dir(apps_dir) else {
return out;
};
for entry in entries.flatten() {
let path = entry.path();
if !path.is_dir() {
continue;
}
let manifest = path.join("manifest.yml");
if manifest.exists() {
out.push(manifest);
}
}
out.sort();
out
}
}
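Review note: the `SecretsProvider` doc comment says the production provider is a directory on disk and that values are returned with trailing whitespace trimmed. The diff only shows the in-memory test provider, so here is a minimal sketch of a directory-backed one under those stated rules. `DirSecretsProvider` is a hypothetical name, `ManifestError` is a one-variant stand-in for the crate's type, and re-checking the bare-filename rule inside `read` is an assumption about defence in depth rather than something the diff does.

```rust
use std::fs;
use std::path::PathBuf;

/// Stand-in for the crate's `ManifestError`; only the variant used here.
#[derive(Debug)]
enum ManifestError {
    Invalid(String),
}

/// Same shape as the trait in the diff.
trait SecretsProvider {
    fn read(&self, name: &str) -> Result<String, ManifestError>;
}

/// Hypothetical directory-backed provider (e.g. rooted at the secrets dir).
struct DirSecretsProvider {
    root: PathBuf,
}

impl SecretsProvider for DirSecretsProvider {
    fn read(&self, name: &str) -> Result<String, ManifestError> {
        // Assumption: repeat the bare-filename check so the provider is safe
        // even if a caller skipped manifest validation.
        if name.is_empty() || name.contains('/') || name.contains("..") {
            return Err(ManifestError::Invalid(format!(
                "invalid secret name '{name}'"
            )));
        }
        let path = self.root.join(name);
        // The error mentions the secret's name, never its value.
        let raw = fs::read_to_string(&path).map_err(|e| {
            ManifestError::Invalid(format!("cannot read secret '{name}': {e}"))
        })?;
        // Trim trailing whitespace so `echo "…" > file` does not leak a newline.
        Ok(raw.trim_end().to_string())
    }
}
```

Only trailing whitespace is trimmed, so a secret with leading spaces is preserved as written.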


@@ -126,7 +126,7 @@ impl PodmanClient {
"filebrowser" => "http://localhost:8083",
"nginx-proxy-manager" => "http://localhost:81",
"portainer" => "http://localhost:9000",
"uptime-kuma" => "http://localhost:3001",
"uptime-kuma" => "http://localhost:3002",
"fedimint" | "fedimintd" => "http://localhost:8175",
"fedimint-gateway" => "http://localhost:8176",
"nostr-rs-relay" => "http://localhost:18081",
@@ -288,12 +288,29 @@ impl PodmanClient {
let mut mounts = Vec::new();
for volume in &manifest.app.volumes {
mounts.push(serde_json::json!({
"destination": volume.target,
"source": volume.source,
"type": "bind",
"options": volume.options,
}));
if volume.volume_type == "tmpfs" {
let options: Vec<String> = volume
.tmpfs_options
.as_deref()
.unwrap_or("")
.split(',')
.map(str::trim)
.filter(|s| !s.is_empty())
.map(|s| s.to_string())
.collect();
mounts.push(serde_json::json!({
"destination": volume.target,
"type": "tmpfs",
"options": options,
}));
} else {
mounts.push(serde_json::json!({
"destination": volume.target,
"source": volume.source,
"type": "bind",
"options": volume.options,
}));
}
}
let mut env_map = serde_json::Map::new();
@@ -306,29 +323,66 @@ impl PodmanClient {
let cap_add: Vec<String> = manifest.app.security.capabilities.clone();
let cap_drop = vec!["ALL".to_string()];
let image_ref = manifest.app.container.image_ref().ok_or_else(|| {
anyhow::anyhow!(
"container config for {} has neither a valid image nor build source",
manifest.app.id
)
})?;
// Build resource_limits conditionally: if the manifest has no memory or
// cpu limit, OMIT the field entirely rather than sending 0. The podman
// libpod HTTP API treats `memory.limit: 0` as "set MemoryMax=0" which
// systemd then rejects at container-start time. Absent = unlimited.
let mut resource_limits = serde_json::Map::new();
if let Some(mem_bytes) = manifest
.app
.resources
.memory_limit
.as_ref()
.and_then(|m| parse_memory_limit(m))
{
resource_limits.insert(
"memory".to_string(),
serde_json::json!({ "limit": mem_bytes }),
);
}
if let Some(cpu) = manifest.app.resources.cpu_limit {
resource_limits.insert(
"cpu".to_string(),
serde_json::json!({
"quota": (cpu as i64) * 100_000,
"period": 100_000u64,
}),
);
}
let net_mode = if let Some(n) = manifest.app.container.network.as_ref() {
if n.is_empty() {
"bridge"
} else {
n.as_str()
}
} else {
match manifest.app.security.network_policy.as_str() {
"host" => "host",
_ => "bridge",
}
};
let body = serde_json::json!({
"name": name,
"image": manifest.app.container.image,
"image": image_ref,
"portmappings": port_mappings,
"mounts": mounts,
"env": env_map,
"entrypoint": manifest.app.container.entrypoint.clone(),
"command": manifest.app.container.custom_args.clone(),
"hostadd": ["host.containers.internal:host-gateway"],
"devices": manifest.app.devices.iter().map(|d| {
serde_json::json!({"path": d})
}).collect::<Vec<_>>(),
"resource_limits": {
"memory": {
"limit": manifest.app.resources.memory_limit.as_ref()
.and_then(|m| parse_memory_limit(m))
.unwrap_or(0),
},
"cpu": {
"quota": manifest.app.resources.cpu_limit
.map(|c| (c as i64) * 100000)
.unwrap_or(0),
"period": 100000u64,
}
},
"resource_limits": resource_limits,
"cap_add": cap_add,
"cap_drop": cap_drop,
"read_only_filesystem": manifest.app.security.readonly_root,
@@ -336,10 +390,7 @@ impl PodmanClient {
"restart_policy": "unless-stopped",
"restart_tries": 5,
"netns": {
"nsmode": match manifest.app.security.network_policy.as_str() {
"host" => "host",
_ => "bridge",
}
"nsmode": net_mode
},
});
@@ -571,26 +622,106 @@ fn parse_port_bindings(bindings: &serde_json::Value) -> Vec<String> {
}
fn parse_memory_limit(limit: &str) -> Option<i64> {
let limit = limit.trim().to_lowercase();
if limit.ends_with('g') {
limit
.trim_end_matches('g')
.parse::<f64>()
.ok()
.map(|v| (v * 1_073_741_824.0) as i64)
} else if limit.ends_with('m') {
limit
.trim_end_matches('m')
.parse::<f64>()
.ok()
.map(|v| (v * 1_048_576.0) as i64)
} else if limit.ends_with('k') {
limit
.trim_end_matches('k')
.parse::<f64>()
.ok()
.map(|v| (v * 1024.0) as i64)
} else {
limit.parse::<i64>().ok()
// Supports the Kubernetes-style suffixes used throughout apps/*/manifest.yml
// (IEC binary: Ki/Mi/Gi/Ti), SI decimal kB/MB/GB/TB, and the shorter
// docker-style k/m/g/t (treated as binary). Longest suffix matched first so
// "Mi" isn't mis-matched as "m".
//
// Historical bug: we used to lowercase the input and match only single-letter
// suffixes, so "128Mi" became "128mi", matched none of g/m/k, and the
// raw-bytes parse::<i64> fallback failed → None → .unwrap_or(0) wrote
// memory.limit:0 into the OCI spec, which systemd then rejected at start
// time with "MemoryMax is out of range" on rootless podman. See
// docs/rust-orchestrator-migration.md Step 9 notes.
let trimmed = limit.trim();
if trimmed.is_empty() {
return None;
}
const UNITS: &[(&str, i64)] = &[
("Ki", 1024),
("Mi", 1024 * 1024),
("Gi", 1024 * 1024 * 1024),
("Ti", 1024i64 * 1024 * 1024 * 1024),
("kB", 1000),
("MB", 1_000_000),
("GB", 1_000_000_000),
("TB", 1_000_000_000_000),
("k", 1024),
("K", 1024),
("m", 1024 * 1024),
("M", 1024 * 1024),
("g", 1024 * 1024 * 1024),
("G", 1024 * 1024 * 1024),
("t", 1024i64 * 1024 * 1024 * 1024),
("T", 1024i64 * 1024 * 1024 * 1024),
("b", 1),
("B", 1),
];
for (suffix, multiplier) in UNITS {
if let Some(num) = trimmed.strip_suffix(suffix) {
let num = num.trim();
return num
.parse::<f64>()
.ok()
.map(|v| (v * (*multiplier as f64)) as i64)
.filter(|n| *n > 0);
}
}
// No recognised suffix — treat as raw bytes.
trimmed.parse::<i64>().ok().filter(|n| *n > 0)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_memory_limit_iec_binary_suffixes() {
// Kubernetes-style — this is what apps/*/manifest.yml uses.
assert_eq!(parse_memory_limit("128Mi"), Some(128 * 1024 * 1024));
assert_eq!(parse_memory_limit("64Mi"), Some(64 * 1024 * 1024));
assert_eq!(parse_memory_limit("4Gi"), Some(4i64 * 1024 * 1024 * 1024));
assert_eq!(parse_memory_limit("512Ki"), Some(512 * 1024));
}
#[test]
fn parse_memory_limit_shorthand_suffixes() {
// Docker-style shorthand — treated as IEC binary for backwards compat.
assert_eq!(parse_memory_limit("128m"), Some(128 * 1024 * 1024));
assert_eq!(parse_memory_limit("128M"), Some(128 * 1024 * 1024));
assert_eq!(parse_memory_limit("2g"), Some(2i64 * 1024 * 1024 * 1024));
assert_eq!(parse_memory_limit("2G"), Some(2i64 * 1024 * 1024 * 1024));
}
#[test]
fn parse_memory_limit_si_decimal_suffixes() {
assert_eq!(parse_memory_limit("1MB"), Some(1_000_000));
assert_eq!(parse_memory_limit("1GB"), Some(1_000_000_000));
}
#[test]
fn parse_memory_limit_raw_bytes() {
assert_eq!(parse_memory_limit("134217728"), Some(134_217_728));
assert_eq!(parse_memory_limit(" 134217728 "), Some(134_217_728));
}
#[test]
fn parse_memory_limit_invalid_returns_none() {
// Regression guard: the old implementation returned None for "128Mi"
// (lowercasing gave "128mi", which matched no suffix and failed the
// raw-bytes parse), and the call site's .unwrap_or(0) turned that into a
// zero limit. The new implementation must never return Some(0) or Some of
// a negative number from any input.
assert_eq!(parse_memory_limit(""), None);
assert_eq!(parse_memory_limit(" "), None);
assert_eq!(parse_memory_limit("abc"), None);
assert_eq!(parse_memory_limit("0"), None);
assert_eq!(parse_memory_limit("0Mi"), None);
assert_eq!(parse_memory_limit("-1Mi"), None);
}
#[test]
fn parse_memory_limit_tolerates_whitespace_and_fractional() {
assert_eq!(
parse_memory_limit(" 1.5Gi "),
Some((1.5 * (1024.0 * 1024.0 * 1024.0)) as i64)
);
}
}
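Review note: the regression that motivated the `parse_memory_limit` rewrite can be reproduced in isolation. The sketch below is a condensed copy of the removed parser (the loop is a restructuring, not the original text) plus a hypothetical helper, `old_limit_sent_to_podman`, that mirrors the removed `.unwrap_or(0)` at the call site.

```rust
/// Condensed copy of the removed parser: lowercase the input, then try the
/// single-letter suffixes g, m, k, then fall back to raw bytes.
fn old_parse_memory_limit(limit: &str) -> Option<i64> {
    let limit = limit.trim().to_lowercase();
    for (suffix, mult) in [('g', 1_073_741_824.0), ('m', 1_048_576.0), ('k', 1024.0)] {
        if limit.ends_with(suffix) {
            return limit
                .trim_end_matches(suffix)
                .parse::<f64>()
                .ok()
                .map(|v| (v * mult) as i64);
        }
    }
    limit.parse::<i64>().ok()
}

/// Hypothetical helper mirroring the removed call site, which substituted 0
/// when parsing failed.
fn old_limit_sent_to_podman(limit: &str) -> i64 {
    old_parse_memory_limit(limit).unwrap_or(0)
}
```

With this, `old_parse_memory_limit("128Mi")` yields `None` and the helper yields `0`, while the docker-style `"128m"` still parses. That is why only manifests using Kubernetes-style suffixes hit the zero-limit failure.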


@@ -1,4 +1,4 @@
use crate::manifest::AppManifest;
use crate::manifest::{AppManifest, BuildConfig};
use crate::podman_client::{ContainerState, ContainerStatus, PodmanClient};
use anyhow::{Context, Result};
use async_trait::async_trait;
@@ -20,6 +20,22 @@ pub trait ContainerRuntime: Send + Sync {
async fn get_container_status(&self, name: &str) -> Result<ContainerStatus>;
async fn get_container_logs(&self, name: &str, lines: u32) -> Result<Vec<String>>;
async fn list_containers(&self) -> Result<Vec<ContainerStatus>>;
/// Check whether an image reference exists in local storage.
///
/// The reconciler calls this before deciding to build. `true` means
/// `image inspect <image_ref>` succeeded (or equivalent); `false` means
/// the image is not present. Registry/network state is explicitly NOT
/// consulted — this is a local-storage check only.
async fn image_exists(&self, image_ref: &str) -> Result<bool>;
/// Build a local image from a `BuildConfig`.
///
/// Equivalent to `podman build -t <tag> -f <dockerfile> [--build-arg K=V ...] <context>`.
/// The resulting image is referenceable by `config.tag` for subsequent
/// `create_container` / `image_exists` calls. Stdout/stderr are collected
/// and included in the error on failure; on success they are discarded.
async fn build_image(&self, config: &BuildConfig) -> Result<()>;
}
pub struct PodmanRuntime {
@@ -32,6 +48,17 @@ impl PodmanRuntime {
client: PodmanClient::new(user),
}
}
/// Run `podman <args>` and return its captured output. This only errors
/// when the process cannot be spawned; callers inspect the exit status and
/// stderr themselves. Used for operations (build, image exists) that are
/// awkward over the HTTP API. The daemon runs as the target user already,
/// so no sudo hop.
async fn podman_cli(&self, args: &[&str]) -> Result<std::process::Output> {
let mut cmd = TokioCommand::new("podman");
cmd.args(args);
cmd.output()
.await
.with_context(|| format!("failed to execute podman {}", args.join(" ")))
}
}
#[async_trait]
@@ -79,6 +106,68 @@ impl ContainerRuntime for PodmanRuntime {
async fn list_containers(&self) -> Result<Vec<ContainerStatus>> {
self.client.list_containers().await
}
async fn image_exists(&self, image_ref: &str) -> Result<bool> {
// `podman image exists` returns 0 if present, 1 if absent. Any other
// exit code is an environment failure we should surface.
let output = self.podman_cli(&["image", "exists", image_ref]).await?;
match output.status.code() {
Some(0) => Ok(true),
Some(1) => Ok(false),
Some(code) => {
let stderr = String::from_utf8_lossy(&output.stderr);
Err(anyhow::anyhow!(
"podman image exists {image_ref} exited with {code}: {stderr}"
))
}
None => Err(anyhow::anyhow!(
"podman image exists {image_ref} terminated by signal"
)),
}
}
async fn build_image(&self, config: &BuildConfig) -> Result<()> {
let args = build_args_for_podman(config);
let borrowed: Vec<&str> = args.iter().map(|s| s.as_str()).collect();
let output = self.podman_cli(&borrowed).await?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
let stdout = String::from_utf8_lossy(&output.stdout);
return Err(anyhow::anyhow!(
"podman build -t {} failed: {stderr}{}{stdout}",
config.tag,
if stderr.is_empty() || stdout.is_empty() {
""
} else {
"\n---stdout---\n"
}
));
}
Ok(())
}
}
/// Build the argv for `podman build` from a BuildConfig.
///
/// Extracted so it can be unit-tested without actually invoking podman.
/// Order is fixed for deterministic tests: subcommand, -t, -f, build-args
/// (sorted by key), context.
fn build_args_for_podman(config: &BuildConfig) -> Vec<String> {
let mut args: Vec<String> = vec![
"build".to_string(),
"-t".to_string(),
config.tag.clone(),
"-f".to_string(),
config.dockerfile.clone(),
];
let mut kv: Vec<(&String, &String)> = config.build_args.iter().collect();
kv.sort_by(|a, b| a.0.cmp(b.0));
for (k, v) in kv {
args.push("--build-arg".to_string());
args.push(format!("{k}={v}"));
}
args.push(config.context.clone());
args
}
pub struct DockerRuntime {
@@ -188,7 +277,13 @@ impl ContainerRuntime for DockerRuntime {
cmd.arg("--cap-add").arg(cap);
}
cmd.arg(&manifest.app.container.image);
let image_ref = manifest.app.container.image_ref().ok_or_else(|| {
anyhow::anyhow!(
"container config for {} has neither a valid image nor build source",
manifest.app.id
)
})?;
cmd.arg(&image_ref);
let output = cmd.output().await.context("Failed to create container")?;
@@ -344,6 +439,55 @@ impl ContainerRuntime for DockerRuntime {
Ok(result)
}
async fn image_exists(&self, image_ref: &str) -> Result<bool> {
// `docker image inspect` exits 1 when the image is absent. Any message
// to stderr in that case is informational; we swallow it.
let mut cmd = self.docker_async();
cmd.arg("image").arg("inspect").arg(image_ref);
let output = cmd
.output()
.await
.context("failed to execute docker image inspect")?;
match output.status.code() {
Some(0) => Ok(true),
Some(1) => Ok(false),
Some(code) => {
let stderr = String::from_utf8_lossy(&output.stderr);
Err(anyhow::anyhow!(
"docker image inspect {image_ref} exited with {code}: {stderr}"
))
}
None => Err(anyhow::anyhow!(
"docker image inspect {image_ref} terminated by signal"
)),
}
}
async fn build_image(&self, config: &BuildConfig) -> Result<()> {
let mut cmd = self.docker_async();
cmd.arg("build")
.arg("-t")
.arg(&config.tag)
.arg("-f")
.arg(&config.dockerfile);
for (k, v) in &config.build_args {
cmd.arg("--build-arg").arg(format!("{k}={v}"));
}
cmd.arg(&config.context);
let output = cmd
.output()
.await
.context("failed to execute docker build")?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(anyhow::anyhow!(
"docker build -t {} failed: {stderr}",
config.tag
));
}
Ok(())
}
}
pub struct AutoRuntime {
@@ -415,7 +559,91 @@ impl ContainerRuntime for AutoRuntime {
async fn list_containers(&self) -> Result<Vec<ContainerStatus>> {
self.runtime.list_containers().await
}
async fn image_exists(&self, image_ref: &str) -> Result<bool> {
self.runtime.image_exists(image_ref).await
}
async fn build_image(&self, config: &BuildConfig) -> Result<()> {
self.runtime.build_image(config).await
}
}
// Runtime factory functions will be provided by the archipelago crate
// that imports this library and has access to Config
#[cfg(test)]
mod tests {
use super::*;
use std::collections::HashMap;
fn cfg(context: &str, tag: &str, dockerfile: &str, args: &[(&str, &str)]) -> BuildConfig {
BuildConfig {
context: context.to_string(),
dockerfile: dockerfile.to_string(),
tag: tag.to_string(),
build_args: args
.iter()
.map(|(k, v)| (k.to_string(), v.to_string()))
.collect::<HashMap<_, _>>(),
}
}
#[test]
fn build_args_minimal() {
let c = cfg("/tmp/ctx", "archy-bitcoin-ui:local", "Dockerfile", &[]);
assert_eq!(
build_args_for_podman(&c),
vec![
"build",
"-t",
"archy-bitcoin-ui:local",
"-f",
"Dockerfile",
"/tmp/ctx",
]
);
}
#[test]
fn build_args_custom_dockerfile() {
let c = cfg("/opt/archy/bitcoin-ui", "x:local", "Dockerfile.prod", &[]);
let got = build_args_for_podman(&c);
assert_eq!(got[3], "-f");
assert_eq!(got[4], "Dockerfile.prod");
assert_eq!(got.last().unwrap(), "/opt/archy/bitcoin-ui");
}
#[test]
fn build_args_are_sorted_deterministically() {
// HashMap iteration order is nondeterministic; build_args_for_podman sorts
// by key so that equivalent BuildConfigs produce identical commands (easier
// to debug, cache-friendly if we ever layer build-cache keys on top).
let c = cfg(
"/c",
"t",
"Dockerfile",
&[("BAR", "2"), ("FOO", "1"), ("BAZ", "3")],
);
let args = build_args_for_podman(&c);
let flat: Vec<&str> = args.iter().map(|s| s.as_str()).collect();
// Build args appear as pairs of --build-arg K=V; locate them:
let mut pairs: Vec<&str> = Vec::new();
for w in flat.windows(2) {
if w[0] == "--build-arg" {
pairs.push(w[1]);
}
}
assert_eq!(pairs, vec!["BAR=2", "BAZ=3", "FOO=1"]);
}
#[test]
fn build_args_context_is_last() {
// Context MUST be the final positional argument — podman treats any
// stray trailing arg after build-args as the context, so placement
// matters. Regression guard.
let c = cfg("/final/context", "t", "Dockerfile", &[("K", "V")]);
let args = build_args_for_podman(&c);
assert_eq!(args.last().unwrap(), "/final/context");
}
}
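The trait docs above say the reconciler calls `image_exists` before deciding to build. The reconciler itself is not part of this diff, so the following is only a sketch of that decision. It uses a simplified synchronous trait (`ImageStore`), a hypothetical `ensure_image` helper, and an in-memory fake in place of the real async `ContainerRuntime` and `BuildConfig`.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

/// Simplified, synchronous stand-in for the two new trait methods.
/// The real trait is async and takes a BuildConfig; this is illustrative.
trait ImageStore {
    fn image_exists(&self, image_ref: &str) -> Result<bool, String>;
    fn build_image(&self, tag: &str) -> Result<(), String>;
}

/// Hypothetical reconciler step: build only when the tag is missing from
/// local storage. Returns Ok(true) if a build was run, Ok(false) otherwise.
fn ensure_image(store: &dyn ImageStore, tag: &str) -> Result<bool, String> {
    if store.image_exists(tag)? {
        return Ok(false);
    }
    store.build_image(tag)?;
    Ok(true)
}

/// In-memory fake so the logic can be exercised without podman or docker.
#[derive(Default)]
struct FakeStore {
    images: RefCell<HashSet<String>>,
}

impl ImageStore for FakeStore {
    fn image_exists(&self, image_ref: &str) -> Result<bool, String> {
        Ok(self.images.borrow().contains(image_ref))
    }

    fn build_image(&self, tag: &str) -> Result<(), String> {
        // A successful "build" makes the tag visible to image_exists.
        self.images.borrow_mut().insert(tag.to_string());
        Ok(())
    }
}
```

Calling `ensure_image` twice with the same tag on a fresh `FakeStore` builds on the first call and skips the build on the second, which is the behaviour the local-storage check is meant to enable.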


@@ -1,9 +1,22 @@
FROM git.tx1138.com/lfg2025/nginx:1.27.4-alpine
# Static site content.
COPY index.html /usr/share/nginx/html/
COPY 50x.html /usr/share/nginx/html/
COPY assets/ /usr/share/nginx/html/assets/
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Run nginx as root to avoid chown failures in rootless Podman user namespaces
#
# NOTE: /etc/nginx/conf.d/default.conf is intentionally NOT copied from
# this build context. It is bind-mounted at container-create time from
# /var/lib/archipelago/bitcoin-ui/nginx.conf on the host, which the
# archipelago prod orchestrator renders with the current base64 RPC
# auth substituted in (see core/archipelago/src/container/bitcoin_ui.rs).
#
# If the bind-mount fails nginx will start with no site configured and
# return 404 on every request. That's the intended safe failure mode —
# better than baking a placeholder into the image and potentially
# serving the upstream RPC proxy with a stale/empty Authorization header.
#
# Run nginx as root to avoid chown failures in rootless Podman user
# namespaces. The rest of the nginx image is unchanged.
RUN sed -i 's/^user nginx;/user root;/' /etc/nginx/nginx.conf && \
mkdir -p /var/cache/nginx/client_temp /var/cache/nginx/proxy_temp \
/var/cache/nginx/fastcgi_temp /var/cache/nginx/uwsgi_temp \


@@ -22,6 +22,6 @@ RUN sed -i 's/^user nginx;/user root;/' /etc/nginx/nginx.conf && \
mkdir -p /var/cache/nginx/client_temp /var/cache/nginx/proxy_temp \
/var/cache/nginx/fastcgi_temp /var/cache/nginx/uwsgi_temp \
/var/cache/nginx/scgi_temp
EXPOSE 8080
EXPOSE 80
ENTRYPOINT []
CMD ["nginx", "-g", "daemon off;"]


@@ -1,5 +1,5 @@
server {
listen 8081;
listen 80;
server_name _;
root /usr/share/nginx/html;

Some files were not shown because too many files have changed in this diff.