Files
archy/docs/RESUME.md

11 KiB

RESUME — Install UX polish round (v1.7.43-alpha)

Last updated: 2026-04-23

Read this first if you're a fresh OpenCode session resuming the install/uninstall/update UX work.


Where we are right now

v1.7.43-alpha shipped and deployed to .228. Latest addition: image-versions.sh path bug fixed (silent update-check failure on all production nodes). User is about to walk the marketplace app-by-app on .228 to shake out any remaining broken apps. Tracker for that walk: docs/MARKETPLACE-QA.md.

Commits on .116:main (newest first, unpushed per user mirror protocol):

  • a9908597 fix(image-versions): locate image-versions.sh at its actual deployed path
  • 013e8df0 docs(resume): add RESUME.md for context-restart recovery
  • f9fef8d2 docs(status): record rounds 3-5 + config migration + changelog as shipped
  • 008da477 docs(changelog): add v1.7.43-alpha entry covering async lifecycle + .23 retirement
  • 0ee16820 fix(config): auto-purge decommissioned .23 VPS from saved registry/mirror configs
  • 22052325 chore: retire .23 VPS mirror, promote .168 OVH to primary
  • f86d86c3 fix(install): kick scanner post-install so Launch button appears immediately
  • 8cc84ebc feat(install): phase-based progress bar replaces unparseable pull bytes
  • 2d5b859e feat(rpc): async-spawn install/uninstall/update lifecycle (Round 2)
  • 0733ac40 fix(ui): shorten install/uninstall/update timeouts for async RPCs (Round 2)
  • e471ef75 fix(rpc): empty icon in transient install entry (Round 2)

Deployed artifacts on .228:

  • Backend: /usr/local/bin/archipelago md5 9b8ead06aaf210b85cd78fce270384e3 (includes image-versions path fix)
  • Frontend: /opt/archipelago/web-ui/ (v1.7.43-alpha changelog with 5 bullets, .168-only registry)
  • Rollback backups: /usr/local/bin/archipelago.bak-pre-async-install + /opt/archipelago/web-ui.bak-pre-async-install/

Rollback command (if catastrophic):

ssh archy228 'sudo cp -a /usr/local/bin/archipelago.bak-pre-async-install /usr/local/bin/archipelago && sudo rsync -a --delete /opt/archipelago/web-ui.bak-pre-async-install/ /opt/archipelago/web-ui/ && sudo systemctl restart archipelago && sudo systemctl reload nginx'

Immediate next step

Phase 1 — browser verification of v1.7.43-alpha on https://192.168.1.228/

  1. Settings → About: top changelog entry reads "v1.7.43-alpha · Apr 23, 2026" with 5 bullets. Last bullet mentions "Update-available badges and version comparisons work again across every app." Hard-refresh (Cmd+Shift+R) if stale.
  2. Settings → App Registries: only 146.59.87.168:3000/lfg2025 + git.tx1138.com. No .23.
  3. Settings → System Update → Update Mirrors: only .168 (Server 1 primary) + tx1138 (Server 2). No .23.
  4. Install SearXNG (small, fast image). Expect: instant button response, 7 phase labels in progress bar (Preparing → Pulling image → Creating container → Starting container → Waiting for healthy → Finalizing → Done), Launch button appears within ~3s of "Done".
  5. Uninstall: snappy, no freeze.

Phase 2 — marketplace walk (app-by-app on .228)

Once Phase 1 is clean, user will install every app in the marketplace catalog one by one. Tracker: docs/MARKETPLACE-QA.md. For each broken app:

  • Triage via journalctl -u archipelago, podman ps -a, podman logs <name>.
  • Identify layer: app recipe / registry image / backend / frontend.
  • Fix, commit fix(app/<name>): ... or similar.
  • Redeploy as needed.
  • Append release-note bullet for the fix (to current in-flight version, or bump to v1.7.44-alpha if the pile grows).
  • User re-verifies, mark in the tracker.

Known pre-existing issue to expect: Vaultwarden container exits immediately on start. Backend correctly detects + removes state entry; needs container-config debug.


Overall mission (unchanged)

User mandate: "best server containers in the world". Polish install/uninstall/update flows for all 6 bundled server containers + marketplace apps before release. Tackle UX issues one by one in order.


Working layout — SSH + SSHFS

  • SSHFS mount: /Users/dorian/mnt/archy-thinkpad/archy:Projects/archy/. Use for all file ops (read/edit/write/glob/grep).
  • Direct SSH: ssh archy (= archipelago@192.168.1.116, ThinkPad dev). Use for git/cargo/npm/systemctl.
  • Demo node: ssh archy228 (= archipelago@192.168.1.228). NOPASSWD sudo. Dashboard login pw: password123.
  • Sudo pw on .116: ThisIsWeb54321@. Fallback sudo pw on .228: archipelago.
  • Cargo: ~/.cargo/bin/cargo on .116. Long builds: nohup ... & disown to /tmp/cargo-build-*.log.
  • SSHFS flake: write sometimes returns NotFound: FileSystem.readFile on new files — retry once.

Deploy recipes

Backend binary (can't cp while running — "Text file busy"; binary ferries via Mac because .116 can't resolve archy228):

# On .116:
~/.cargo/bin/cargo build --release   # ~3.5 min
# From Mac:
scp archy:Projects/archy/core/target/release/archipelago /tmp/archipelago-new
scp /tmp/archipelago-new archy228:/tmp/archipelago-new
ssh archy228 'sudo systemctl stop archipelago && sudo cp /tmp/archipelago-new /usr/local/bin/archipelago && sudo systemctl start archipelago && sudo systemctl reload nginx'

Frontend (rsyncs via Mac):

ssh archy 'cd ~/Projects/archy/neode-ui && npm run build'   # outputs to ../web/dist/neode-ui/
rsync -az --delete archy:Projects/archy/web/dist/neode-ui/ /tmp/archy-web/
rsync -az /tmp/archy-web/ archy228:/tmp/archy-web/
ssh archy228 'sudo rsync -a --delete /tmp/archy-web/ /opt/archipelago/web-ui/ && sudo systemctl reload nginx'

Note: frontend source is neode-ui/ (has package.json). web/ has no package.json; web/dist/neode-ui/ is the build output.

Commit protocol

  • Never push. User mirrors to Gitea remotes manually.
  • Conventional Commits. No em-dashes or fancy punctuation.
  • Multi-line messages via tmp-commit-msg.txt: git commit -F tmp-commit-msg.txt && rm tmp-commit-msg.txt.
  • Git remotes on .116: gitea-local, gitea-vps2 (.168 OVH), tx1138 (canonical), origin (multi-push alias). .23 URLs were removed from origin and gitea-vps remote was deleted — working-copy change, not in any commit.

Verification gates

  1. cargo check
  2. cargo test -p archipelago --bin archipelago <filter> (MUST use --bin archipelago; no lib target)
  3. cargo build --release

Known issues:

  • rust-lld: undefined hidden symbol → cargo bug with test+release incremental collision. Fix: rm -rf core/target/debug/incremental and retry.
  • 22 pre-existing cargo test failures in unrelated modules (mesh/wallet/credentials/avatar/session/transport/update-mirrors/fips/identity_manager/image_versions). Not blocking. Tech debt.

Architecture — locked-in patterns

Async-spawn lifecycle (install/update/uninstall/start/stop/restart)

  • RPC returns {status, package_id} immediately (15s client timeout).
  • Wrapper flips state to transitional variant (Installing/Updating/Removing/Stopping/Starting/Restarting) BEFORE spawn.
  • tokio::spawn runs existing monolithic inner handler with self: Arc<Self>.
  • Install/update success: MUST explicitly write terminal Running state. merge_preserving_transitional in server.rs refuses to let scanner overwrite transitional states.
  • Uninstall success: inner handler removes the entry itself.
  • On error: revert pre-transition state (or remove entry for install).

Key files:

  • core/archipelago/src/api/rpc/package/async_lifecycle.rs — full install/update/uninstall wrappers
  • core/archipelago/src/api/rpc/transitional.rs — start/stop/restart wrappers
  • core/archipelago/src/server.rs:832-871merge_preserving_transitional, is_transitional
  • core/archipelago/src/server.rs:295-380 — scan loop with tokio::select! and tick bump

Install progress (phase-based, 7 levels)

  • podman pull emits zero parseable progress when stderr is piped (no TTY). Legacy byte regex never matched.
  • Phases + UI %: Preparing (5) → PullingImage (20) → CreatingContainer (70) → StartingContainer (80) → WaitingHealthy (88) → PostInstall (95) → Done (100).
  • UI bar only advances forward (Math.max).
  • Final phase label is "Finalizing…" (renamed from "Running post-install…" which confused users).

Key files:

  • neode-ui/src/stores/server.ts:25-33PHASE_INFO mapper

Scanner kick (instant Launch button)

  • Scan runs every 60s. Post-install state flipped to Running but skeletal manifest (interfaces: None) persisted until next scan → canLaunch(pkg) false for up to 60s.
  • lan_address derived from live container port bindings. manifest.interfaces.main.ui only populated when lan_address.is_some() || tor_address.is_some().
  • Fix: scan_kick: Arc<Notify> + scan_tick: Arc<watch::Sender<u64>> on RpcHandler. Scan loop tokio::select! between 60s tick + notify. kick_scanner_and_wait helper (2s timeout) called in install/update success paths BEFORE writing Running. Merge during Installing keeps state + takes fresh manifest.

Key files:

  • core/archipelago/src/api/rpc/mod.rs:89-93 — fields on RpcHandler; accessors :186-199
  • core/archipelago/src/api/rpc/package/async_lifecycle.rs:405-430kick_scanner_and_wait
  • core/archipelago/src/container/docker_packages.rs:132-218 — where lan_address + manifest get populated
  • neode-ui/src/views/apps/appsConfig.ts:106-111canLaunch(pkg)
  • neode-ui/src/views/apps/AppCard.vue:141-149 — Launch button render

Config migration (.23 auto-purge)

  • load_mirrors + load_registries normally only ADD missing defaults ("explicit removals stick").
  • .23 was a default the user never chose, so we need the opposite: strip it.
  • .retain(|m| !m.url.contains("23.182.128.160")) before defaults-merge step. Narrow-scope exception, commented in-code.
  • Triggers lazily on next load (install RPC, update RPC, Settings UI open). Not tied to boot.

Key files:

  • core/archipelago/src/container/registry.rsload_registries
  • core/archipelago/src/update.rsload_mirrors

Backlog — after v1.7.43 verification

  1. User reports browser-verification results. Fix anything that fails.
  2. Continue user's "one by one" install/uninstall/update UX queue — ask for next issue.
  3. Tech debt (low priority, not blocking release):
    • Vaultwarden container exits immediately on start (separate container-config issue).
    • 22 pre-existing cargo test failures in unrelated modules.
    • "Server 3 (OVH)" historical changelog entries in AccountInfoSection.vue left intact (user approved — they're release notes for what shipped at the time).

User preferences (must follow)

  • Always state which option is "best long-term" first and explain why in plain terms. Trust my recommendation unless overridden.
  • "Tackle them one by one in order" — fix issues sequentially, not in a big bang.
  • Bitcoin-only. No altcoins, no proprietary deps without approval.
  • Prefer established OSS, crypto-first libs (rustls, argon2, ed25519), privacy-focused (no telemetry), minimal dep trees.
  • Atomic commits.
  • Never commit secrets. Pin dependency versions.
  • Never push — user mirrors to Gitea manually.