feat(systemd): delegate cgroup controllers to archipelago.service

Adds Delegate=memory pids cpu io to the archipelago.service unit.

Context: the service runs as User=archipelago under system.slice with rootless podman. When podman creates transient libpod-*.scope units for containers under user.slice, systemd requires the caller to hold CAP_SYS_ADMIN on the target cgroup subtree, which happens iff Delegate= lists the controllers we want to set. Without Delegate=, any future code path that goes through the podman CLI (runtime.rs) instead of the libpod HTTP API (podman_client.rs) would hit MemoryMax rejections with exactly the same symptom as the bug I just fixed in parse_memory_limit, but a completely different root cause.

Belt-and-braces: the current production path uses PodmanClient and was fixed in the preceding commit. But the DockerRuntime CLI path in runtime.rs:262-268 (cmd.arg("--memory")) is still reachable via the AutoRuntime fallback on hosts without podman, and future Rust orchestrator code may legitimately need cgroup delegation.

This directive is a harmless no-op on hosts that already delegate upstream (systemd gracefully handles duplicate/nested delegation).
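
One way to confirm on a host that the delegation took effect is to read the service cgroup's cgroup.controllers file, which on cgroup v2 lists the controllers available to that cgroup. The sketch below is illustrative only and is not part of this commit: the function name missing_controllers and the path in main are assumptions (a unified hierarchy mounted at /sys/fs/cgroup, with the unit placed directly under system.slice).

```rust
use std::collections::BTreeSet;
use std::fs;

/// Controllers named in `Delegate=memory pids cpu io`.
const WANTED: [&str; 4] = ["memory", "pids", "cpu", "io"];

/// Hypothetical helper: given the contents of a cgroup v2
/// `cgroup.controllers` file (space-separated controller names),
/// return the wanted controllers that are not listed.
fn missing_controllers(cgroup_controllers: &str) -> Vec<&'static str> {
    let available: BTreeSet<&str> = cgroup_controllers.split_whitespace().collect();
    WANTED
        .iter()
        .copied()
        .filter(|c| !available.contains(c))
        .collect()
}

fn main() {
    // Assumed location of the service cgroup; adjust for the actual host.
    let path = "/sys/fs/cgroup/system.slice/archipelago.service/cgroup.controllers";
    match fs::read_to_string(path) {
        Ok(contents) => {
            let missing = missing_controllers(&contents);
            if missing.is_empty() {
                println!("all requested controllers are available");
            } else {
                println!("missing controllers: {}", missing.join(" "));
            }
        }
        // The file is absent when the service is not running or cgroup v1 is in use.
        Err(err) => eprintln!("could not read {path}: {err}"),
    }
}
```

An empty result only shows that the controllers are enabled for the service cgroup; it does not by itself prove that a given podman invocation will succeed.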
2026-04-23 03:44:36 -04:00
parent 732df1b8cb
commit ba83f9bce2
1 changed files with 8 additions and 0 deletions
--- a/image-recipe/configs/archipelago.service
+++ b/image-recipe/configs/archipelago.service
@@ -48,6 +48,14 @@ MemoryMax=4G
 LimitNOFILE=65535
 TasksMax=2048

+# Delegate cgroup controllers so rootless podman (run from this system service
+# as user=archipelago, not user@1000.service) can create transient libpod-*.scope
+# units with --memory / --cpus / --pids-limit. Without this, podman create fails
+# at start time with: "MemoryMax is out of range" because systemd rejects resource
+# limits on undelegated cgroup subtrees. Required for the ProdContainerOrchestrator
+# code path (see core/archipelago/src/container/prod_orchestrator.rs).
+Delegate=memory pids cpu io
+
 # Logging
 StandardOutput=journal
 StandardError=journal