feat(systemd): delegate cgroup controllers to archipelago.service

Adds Delegate=memory pids cpu io to the archipelago.service unit.

Context: the service runs as User=archipelago under system.slice with
rootless podman. When podman creates transient libpod-*.scope units for
containers under user.slice, systemd needs the caller to hold
CAP_SYS_ADMIN on the target cgroup subtree, which happens iff
Delegate= lists the controllers we want to set. Without Delegate, any
future code path that goes through the podman CLI (runtime.rs) instead
of the libpod HTTP API (podman_client.rs) would hit MemoryMax
rejections that have exactly the same symptom as the bug I just fixed
in parse_memory_limit but with a completely different root cause.

Belt-and-braces: current production path uses PodmanClient and was
fixed in the preceding commit. But the DockerRuntime CLI path in
runtime.rs:262-268 (cmd.arg("--memory")) is still reachable via
AutoRuntime fallback on hosts without podman, and future Rust
orchestrator code may legitimately need cgroup delegation. This
directive is a harmless no-op on hosts that already delegate upstream
(systemd gracefully handles duplicate/nested delegation).
archipelago
2026-04-23 03:44:36 -04:00
parent 732df1b8cb
commit ba83f9bce2

@@ -48,6 +48,14 @@ MemoryMax=4G
LimitNOFILE=65535
TasksMax=2048
# Delegate cgroup controllers so rootless podman (run from this system service
# as User=archipelago, not user@1000.service) can create transient libpod-*.scope
# units with --memory / --cpus / --pids-limit. Without this, podman create fails
# at start time with: "MemoryMax is out of range" because systemd rejects resource
# limits on undelegated cgroup subtrees. Required for the ProdContainerOrchestrator
# code path (see core/archipelago/src/container/prod_orchestrator.rs).
Delegate=memory pids cpu io
# Logging
StandardOutput=journal
StandardError=journal
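
One way to confirm the delegation took effect after a restart is to compare the controllers listed in the unit's cgroup.controllers file against the four named in Delegate=. The following is a minimal sketch, not part of this commit: the helper name missing_controllers is hypothetical, and the path assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup with the unit placed directly under system.slice.

```rust
use std::collections::HashSet;
use std::fs;

/// Controllers named in `Delegate=memory pids cpu io`.
const REQUIRED: [&str; 4] = ["memory", "pids", "cpu", "io"];

/// Given the contents of a cgroup v2 `cgroup.controllers` file
/// (a whitespace-separated list), return the required controllers
/// that are absent, in the order they appear in REQUIRED.
fn missing_controllers(cgroup_controllers: &str) -> Vec<&'static str> {
    let available: HashSet<&str> = cgroup_controllers.split_whitespace().collect();
    REQUIRED
        .iter()
        .copied()
        .filter(|c| !available.contains(*c))
        .collect()
}

fn main() {
    // Assumed path: cgroup v2 at /sys/fs/cgroup, unit directly under system.slice.
    let path = "/sys/fs/cgroup/system.slice/archipelago.service/cgroup.controllers";
    match fs::read_to_string(path) {
        Ok(contents) => {
            let missing = missing_controllers(&contents);
            if missing.is_empty() {
                println!("all required controllers are available");
            } else {
                println!("missing controllers: {}", missing.join(", "));
            }
        }
        Err(e) => eprintln!("could not read {path}: {e}"),
    }
}
```

The parsing is kept in a pure function so it can be exercised without a live cgroup tree; only main touches the filesystem.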