Hermes Gateway launchd bootstrap stabilization
2026-04-15
hermes
[phase17-followup, launchd, gateway, hermes, stability]
Conclusion
- ~/Library/LaunchAgents/ai.hermes.gateway.plist already existed.
- The previously active plist had two core problems:
  - API_SERVER_ENABLED/HOST/PORT were missing, so a launchd cold start could not guarantee the API server on port 18789.
  - ProgramArguments contained --replace, which can conflict with launchd managing the process lifecycle itself.
- The plist has been normalized:
  - API_SERVER_ENABLED=true
  - API_SERVER_HOST=127.0.0.1
  - API_SERVER_PORT=18789
  - --replace removed
  - RunAtLoad=true
  - KeepAlive=true
  - stdout/stderr split into gateway-launchd.log / gateway-launchd.error.log
- However, the actual bootstrap cutover was not completed.
  - Foreground PID 41269 currently holds port 18789.
  - In the sandbox, kill 41269 failed with Operation not permitted.
  - With the port still held, launchctl bootstrap returned Bootstrap failed: 5: Input/output error.
  - As a result, 18789 is still served by foreground PID 41269, and the launchd service is not loaded in the current domain.
In short: plist normalization is complete; the actual launchd cutover is incomplete because of permissions.
Existing plist discovery
Found at:
- /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist
- Backups:
  - /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak
  - /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak-20260414-api-server
  - /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak_bot_migration_20260413
No Hermes gateway plist was found in the system locations.
=== 2026-04-15 12:25:59 KST initial discovery ===
--- LaunchAgents candidates ---
-rw-r--r--@ 1 ron staff 1443 Apr 15 04:09 /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist
-rw-r--r--@ 1 ron staff 1608 Apr 13 14:24 /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak
-rw-r--r--@ 1 ron staff 1443 Apr 14 10:22 /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak-20260414-api-server
-rw-r--r--@ 1 ron staff 1608 Apr 13 17:07 /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak_bot_migration_20260413
-rw-r--r--@ 1 ron staff 768 Apr 12 14:10 /Users/ron/Library/LaunchAgents/com.openclaw.hermes-tailer.plist
-rw-r--r--@ 1 ron staff 922 Apr 12 19:41 /Users/ron/Library/LaunchAgents/com.openclaw.hermes-upgrade-v08.plist
Cause of the earlier failure
Confirmed state of the previous plist:
"EnvironmentVariables" => {
"HERMES_HOME" => "/Users/ron/.hermes"
"PATH" => "/Users/ron/.hermes/hermes-agent/venv/bin:..."
"VIRTUAL_ENV" => "/Users/ron/.hermes/hermes-agent/venv"
}
"ProgramArguments" => [
"/Users/ron/.hermes/hermes-agent/venv/bin/python",
"-m",
"hermes_cli.main",
"gateway",
"run",
"--replace"
]
Assessment:
| Candidate | Verdict | Evidence |
|---|---|---|
| Missing env vars | Applies | The active plist had no API_SERVER_ENABLED/HOST/PORT. |
| Path error | No | The venv python and the working dir exist. |
| plist syntax | No | plutil -p could read it, and the new plist passes plutil -lint. |
| Permissions | Partly applies | The sandbox restricts kill, curl to localhost, and log show. |
| Port conflict | Applies | Foreground PID 41269 is LISTENing on 18789, so the bootstrap cutover is likely to fail. |
| --replace lifecycle conflict | Likely applies | launchd reports last exit code = 0, state = not running. The long-running daemon appears to have exited normally. |
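The two plist-level findings in the table (missing API_SERVER_* variables, and --replace in ProgramArguments) can be checked mechanically. The following is a minimal sketch, not part of Hermes: the function names `audit_plist` and `audit_plist_file` are hypothetical, and the `old` dict is an abridged copy of the plutil -p output above.

```python
import plistlib

# Environment variables the API server needs on a launchd cold start.
REQUIRED_ENV = ("API_SERVER_ENABLED", "API_SERVER_HOST", "API_SERVER_PORT")


def audit_plist(config: dict) -> list[str]:
    """Return human-readable findings for a parsed launchd plist dict."""
    findings = []
    env = config.get("EnvironmentVariables", {})
    for key in REQUIRED_ENV:
        if key not in env:
            findings.append(f"missing env var: {key}")
    if "--replace" in config.get("ProgramArguments", []):
        findings.append("ProgramArguments contains --replace")
    return findings


def audit_plist_file(path: str) -> list[str]:
    """Load a plist file (XML or binary) and audit it."""
    with open(path, "rb") as fh:
        return audit_plist(plistlib.load(fh))


# Abridged state of the previous plist (PATH and VIRTUAL_ENV omitted).
old = {
    "EnvironmentVariables": {"HERMES_HOME": "/Users/ron/.hermes"},
    "ProgramArguments": [
        "/Users/ron/.hermes/hermes-agent/venv/bin/python",
        "-m", "hermes_cli.main", "gateway", "run", "--replace",
    ],
}
for finding in audit_plist(old):
    print(finding)
```

An empty list means neither finding applies. This complements plutil -lint, which validates syntax only.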
Raw launchd state:
gui/501/ai.hermes.gateway = {
active count = 0
path = /Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist
type = LaunchAgent
state = not running
program = /Users/ron/.hermes/hermes-agent/venv/bin/python
arguments = {
/Users/ron/.hermes/hermes-agent/venv/bin/python
-m
hermes_cli.main
gateway
run
--replace
}
runs = 2
last exit code = 0
}
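The "exited normally but not running" reading of this output can be made explicit with a small parser. This is a sketch that assumes the simple `key = value` layout shown above; real `launchctl print` output contains more fields and nested blocks, which this parser simply skips. The function names are hypothetical.

```python
import re

# Matches lines such as "state = not running" or "last exit code = 0".
_FIELD = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*=\s*(\S.*?)\s*$")


def parse_launchctl_print(text: str) -> dict:
    """Collect simple `key = value` lines; skip lines that open a `{` block."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match and not match.group(2).endswith("{"):
            fields[match.group(1)] = match.group(2)
    return fields


def exited_cleanly_but_stopped(fields: dict) -> bool:
    """True when the job is not running and its last exit code was 0."""
    return (
        fields.get("state") == "not running"
        and fields.get("last exit code") == "0"
    )


# Excerpt of the launchctl output recorded in this report.
sample = """\
gui/501/ai.hermes.gateway = {
active count = 0
type = LaunchAgent
state = not running
arguments = {
run
--replace
}
runs = 2
last exit code = 0
}
"""
fields = parse_launchctl_print(sample)
print(fields["state"], fields["runs"], exited_cleanly_but_stopped(fields))
```

For a long-running daemon, a clean exit combined with `state = not running` is the signal that the process handed off or quit on its own rather than crashing.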
Current foreground state:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 41269 ron 21u IPv4 0x49eb4a93c543c1be 0t0 TCP 127.0.0.1:18789 (LISTEN)
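To avoid hardcoding the PID in later steps, the listener PID can be extracted from this lsof output. A minimal sketch, assuming the default column layout shown above (PID in the second column); `listener_pids` is a hypothetical helper name.

```python
def listener_pids(lsof_output: str) -> list[int]:
    """Return unique PIDs from `lsof -nP -iTCP:<port> -sTCP:LISTEN` output."""
    pids = []
    for line in lsof_output.splitlines():
        parts = line.split()
        # Skip the header row and any line without a numeric PID column.
        if len(parts) < 2 or not parts[1].isdigit():
            continue
        pid = int(parts[1])
        if pid not in pids:
            pids.append(pid)
    return pids


# The lsof output recorded in this report.
sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 41269 ron 21u IPv4 0x49eb4a93c543c1be 0t0 TCP 127.0.0.1:18789 (LISTEN)
"""
print(listener_pids(sample))
```

When only the PID is needed, `lsof -t` prints bare PIDs and makes this parsing unnecessary.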
Full text of the new plist
Backup:
/Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak-260415-launchd-20260415T122644
Current plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>ai.hermes.gateway</string>
<key>ProgramArguments</key>
<array>
<string>/Users/ron/.hermes/hermes-agent/venv/bin/python</string>
<string>-m</string>
<string>hermes_cli.main</string>
<string>gateway</string>
<string>run</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/ron/.hermes/hermes-agent</string>
<key>EnvironmentVariables</key>
<dict>
<key>HERMES_HOME</key>
<string>/Users/ron/.hermes</string>
<key>API_SERVER_ENABLED</key>
<string>true</string>
<key>API_SERVER_HOST</key>
<string>127.0.0.1</string>
<key>API_SERVER_PORT</key>
<string>18789</string>
<key>PATH</key>
<string>/Users/ron/.hermes/hermes-agent/venv/bin:/Users/ron/.hermes/hermes-agent/node_modules/.bin:/opt/homebrew/Cellar/node@22/22.22.0/bin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/ron/.local/bin</string>
<key>VIRTUAL_ENV</key>
<string>/Users/ron/.hermes/hermes-agent/venv</string>
<key>PYTHONUNBUFFERED</key>
<string>1</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>StandardOutPath</key>
<string>/Users/ron/.hermes/logs/gateway-launchd.log</string>
<key>StandardErrorPath</key>
<string>/Users/ron/.hermes/logs/gateway-launchd.error.log</string>
</dict>
</plist>
Syntax validation:
=== 2026-04-15 12:26:44 KST write normalized plist ===
backup=/Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist.bak-260415-launchd-20260415T122644
/Users/ron/Library/LaunchAgents/ai.hermes.gateway.plist: OK
bootstrap result
Cutover sequence:
- Prepare the fallback command
- Boot out the stale launchd label
- Attempt to terminate foreground PID 41269
- Attempt bootstrap
- Check 18789 / health
Raw output:
=== 2026-04-15 12:27:09 KST controlled cutover attempt ===
fallback command prepared: cd /Users/ron/.hermes/hermes-agent && HERMES_HOME=/Users/ron/.hermes API_SERVER_ENABLED=true API_SERVER_HOST=127.0.0.1 API_SERVER_PORT=18789 nohup /Users/ron/.hermes/hermes-agent/venv/bin/python -m hermes_cli.main gateway run --replace > /Users/ron/.hermes/logs/gateway-foreground-fallback.log 2>&1 &
current_18789_pid=41269
--- bootout stale launchd label (before foreground kill) ---
--- terminate foreground pid just before bootstrap ---
kill: 41269: Operation not permitted
listen after kill attempt:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 41269 ron 21u IPv4 0x49eb4a93c543c1be 0t0 TCP 127.0.0.1:18789 (LISTEN)
--- bootstrap normalized plist ---
Bootstrap failed: 5: Input/output error
Try re-running the command as root for richer errors.
--- post-bootstrap state ---
Bad request.
Could not find service "ai.hermes.gateway" in domain for user gui: 501
--- launchctl list grep ---
--- listen 18789 ---
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 41269 ron 21u IPv4 0x49eb4a93c543c1be 0t0 TCP 127.0.0.1:18789 (LISTEN)
--- health curl ---
curl: (7) Failed to connect to 127.0.0.1 port 18789 after 0 ms: Couldn't connect to server
Interpretation:
- bootout removed the stale launchd label.
- Terminating the foreground PID failed because of sandbox permissions.
- bootstrap was attempted while 18789 was still held, and it returned Input/output error.
- The new launchd stdout/stderr files were not created, so the gateway process apparently was never spawned.
--- gateway-launchd.log ---
tail: /Users/ron/.hermes/logs/gateway-launchd.log: No such file or directory
--- gateway-launchd.error.log ---
tail: /Users/ron/.hermes/logs/gateway-launchd.error.log: No such file or directory
health / cron status verification
The current LISTEN on 18789 is provided by the foreground PID.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 41269 ron 21u IPv4 0x49eb4a93c543c1be 0t0 TCP 127.0.0.1:18789 (LISTEN)
However, curl to localhost failed from the sandbox.
curl: (7) Failed to connect to 127.0.0.1 port 18789 after 0 ms: Couldn't connect to server
hermes cron list also could not see a launchd registration and printed a gateway-not-running warning.
⚠ Gateway is not running — jobs won't fire automatically.
Start it with: hermes gateway install
sudo hermes gateway install --system # Linux servers
Note: the runtime status file still recorded foreground PID 41269 and api_server connected.
{
"pid": 41269,
"kind": "hermes-gateway",
"argv": ["/Users/ron/.local/bin/hermes", "gateway", "run"],
"gateway_state": "running",
"platforms": {
"telegram": {"state": "connected"},
"api_server": {"state": "connected"}
}
}
However, in this session a status check based on os.kill(pid, 0) can hit a sandbox PermissionError, so the CLI cannot trust that the gateway is running.
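For illustration, a liveness check can treat PermissionError as "the process exists" instead of as a failure. This is a sketch of the general POSIX idiom, not the actual Hermes CLI implementation. It assumes the sandbox reports EPERM only for processes that really exist, which was not verified here.

```python
import os


def pid_alive(pid: int) -> bool:
    """Report whether a process exists, even when we may not signal it."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence/permission check only
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but we cannot signal it
    return True


print(pid_alive(os.getpid()))
```

Under that assumption, a status check written this way would still report PID 41269 as alive from inside the sandbox.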
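KeepAlive auto-recovery verification
Not completed.
Reasons:
- The launchd bootstrap itself did not complete.
- The foreground PID could not be terminated from the sandbox, so ownership could not be handed over to launchd.
- Therefore the KeepAlive restart test after launchctl kill was not performed either.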
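Once the test can be run, the pass criterion can be stated as a comparison of two `launchctl print` snapshots taken before and after `launchctl kill`. The sketch below assumes each snapshot is a dict holding the `state`, `pid`, and `runs` fields as strings. The function name and the sample values are hypothetical and are not measured results.

```python
def keepalive_recovered(before: dict, after: dict) -> bool:
    """Compare launchd job fields captured before and after a kill."""
    # launchd increments `runs` each time it spawns the job.
    restarted = int(after["runs"]) > int(before["runs"])
    # A restart must produce a different process than the one that was killed.
    new_pid = after.get("pid") is not None and after.get("pid") != before.get("pid")
    return after.get("state") == "running" and restarted and new_pid


# Hypothetical snapshots for illustration only.
before = {"state": "running", "pid": "1001", "runs": "1"}
after = {"state": "running", "pid": "1002", "runs": "2"}
print(keepalive_recovered(before, after))
```

Requiring all three conditions avoids a false pass when the kill did not land and the original process is still running.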
Cutover commands to run from the main session
The current Codex sandbox restricts killing PIDs, curl to localhost, and detailed system logs. Run the following sequence from the main session.
# 0) Check the current foreground PID
lsof -nP -iTCP:18789 -sTCP:LISTEN
# 1) Remove the stale label, if any
launchctl bootout gui/$(id -u)/ai.hermes.gateway 2>/dev/null || true
# 2) Stop the foreground gateway
kill -TERM 41269
sleep 3
lsof -nP -iTCP:18789 -sTCP:LISTEN || true
# 3) Bootstrap the normalized plist
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/ai.hermes.gateway.plist
sleep 5
# 4) Verify
launchctl list | grep hermes
launchctl print gui/$(id -u)/ai.hermes.gateway | egrep 'state =|pid =|last exit code|runs ='
lsof -nP -iTCP:18789 -sTCP:LISTEN
curl -sS http://127.0.0.1:18789/v1/health
hermes cron list | tail -20
# 5) Verify KeepAlive
launchctl kill TERM gui/$(id -u)/ai.hermes.gateway
sleep 12
launchctl print gui/$(id -u)/ai.hermes.gateway | egrep 'state =|pid =|runs =|last terminating signal|last exit code'
lsof -nP -iTCP:18789 -sTCP:LISTEN
curl -sS http://127.0.0.1:18789/v1/health
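The fixed sleeps in this sequence assume the gateway releases or binds the port within a few seconds. A polling helper is one way to make those waits explicit. This is a sketch with hypothetical helper names, and it needs localhost TCP access, so it is meant for the main session rather than the sandbox.

```python
import socket
import time


def port_is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """True if a TCP connect to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def wait_until(check, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (not executed here): after `kill -TERM`, wait for 18789 to be free.
# released = wait_until(lambda: not port_is_listening("127.0.0.1", 18789))
```

The same helper can wait for the port to come back after bootstrap by dropping the `not` in the lambda.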
Foreground fallback on failure:
cd /Users/ron/.hermes/hermes-agent
HERMES_HOME=/Users/ron/.hermes \
API_SERVER_ENABLED=true \
API_SERVER_HOST=127.0.0.1 \
API_SERVER_PORT=18789 \
nohup /Users/ron/.hermes/hermes-agent/venv/bin/python -m hermes_cli.main gateway run --replace \
> /Users/ron/.hermes/logs/gateway-foreground-fallback.log 2>&1 &
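Current state
- plist file: normalized.
- launchd service: not registered in the current domain, because bootstrap failed after bootout.
- 18789: foreground PID 41269 is still LISTENing.
- Reboot stability: not yet confirmed. The file is in ~/Library/LaunchAgents, so it may be loaded at the next login, but bootstrap verification failed in the current session.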
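Self-assessment
- Accuracy: 3.8/5. The plist root-cause analysis and normalization were done correctly. Actual bootstrap/KeepAlive verification is incomplete because of permissions.
- Completeness: 3.6/5. The cutover commands and a fallback were documented, but the launchd loaded state was not achieved.
- Verification: 3.5/5. Captured plutil, launchctl state, lsof, and the raw bootstrap failure output. No successful health/KeepAlive verification.
- Minimal change: 4.7/5. Only the Hermes gateway plist was modified; no other cron job or service was touched.
Overall: 3.9/5.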