260414 Hermes provider/key chain cleanup

Written: 2026-04-14 KST
Scope: only ~/.hermes/* was inspected and modified. ~/.openclaw/* and shared/llm.py were not modified.

1. Conclusion

Item	Action/Verdict
OpenRouter	No `OPENROUTER_API_KEY`. Removed from the operational chain. It still resolves on explicit request (with an empty key), but it is no longer part of the default/fallback chain
Primary provider	Changed from `github-copilot` to `minimax`, to relieve pressure on the Copilot 12/day quota
Fallback	`ollama/qwen2.5:3b` kept, and `fallback_providers` added explicitly
Anthropic	No direct Anthropic key/OAuth. The Anthropic provider stays disabled; for now only MiniMax's Anthropic-compatible endpoint is used
Ollama alias	Blocked the path in `runtime_provider.py` where `ollama` fell through to OpenRouter. `ollama` now resolves to the local custom endpoint `http://localhost:11434/v1`
Test	One real LLM call through the Hermes CLI succeeded: response `OK`

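Note: the running 18789 Hermes gateway process was not restarted. These changes were written to disk and verified in a new hermes_cli.main process. Applying them to the production gateway requires a controlled restart after separate approval.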

2. Current key/provider state as verified

Findings from ~/.hermes/.env, ~/.hermes/config.yaml, and ~/.hermes/auth.json:

Provider	Key/auth state	Verdict
OpenRouter	None	Must not be used in the default chain
OpenAI/OpenRouter env	None	Automatic OpenRouter routing forbidden
Anthropic direct	None	Direct Anthropic disabled
OpenAI Codex OAuth	Traces exist in the auth store, but refresh returns 401	Currently unsuitable as a stable primary provider
GitHub Copilot	Resolvable via `gh auth token`	Excluded from primary because of the 12/day quota
MiniMax	Key present	Currently the only stable external candidate
Ollama	Local endpoint	Final free fallback

Measured:

openai-codex: Codex token refresh failed with status 401
openrouter: api_key empty
anthropic: No Anthropic credentials found
minimax: key present, api_mode=anthropic_messages
ollama: local runtime → http://localhost:11434/v1

3. Changes

3.1 `~/.hermes/config.yaml`

Before:

model:
  default: gpt-5-mini
  provider: github-copilot
fallback_model:
  provider: ollama
  model: qwen2.5:3b

After:

model:
  default: MiniMax-M2.7-highspeed
  provider: minimax
  api_mode: anthropic_messages
fallback_model:
  provider: ollama
  model: qwen2.5:3b
  base_url: http://localhost:11434/v1
fallback_providers:
- provider: ollama
  model: qwen2.5:3b
  base_url: http://localhost:11434/v1
compression:
  summary_provider: minimax
  summary_model: MiniMax-M2.7-highspeed

Intent:

Avoid exhausting the Copilot quota
Remove the OpenRouter no-key path
Pin compression/summary to MiniMax so it does not drift to OpenRouter auto
Declare the Ollama fallback in both the legacy fallback_model and the new fallback_providers
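
The intent above amounts to an ordered chain in which nothing points at OpenRouter. A minimal sketch of that check follows, assuming the config has already been parsed into a dict with the same keys as the YAML above. `build_chain` and `uses_openrouter` are hypothetical helpers, not part of Hermes, and the precedence of `fallback_providers` over the legacy `fallback_model` is an assumption.

```python
from typing import Any

# Mirrors the "After" YAML as a plain dict (assumes it was already parsed).
CONFIG: dict[str, Any] = {
    "model": {"default": "MiniMax-M2.7-highspeed", "provider": "minimax",
              "api_mode": "anthropic_messages"},
    "fallback_model": {"provider": "ollama", "model": "qwen2.5:3b",
                       "base_url": "http://localhost:11434/v1"},
    "fallback_providers": [
        {"provider": "ollama", "model": "qwen2.5:3b",
         "base_url": "http://localhost:11434/v1"},
    ],
    "compression": {"summary_provider": "minimax",
                    "summary_model": "MiniMax-M2.7-highspeed"},
}


def build_chain(config: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (provider, model) pairs in priority order, without duplicates."""
    primary = config["model"]
    candidates = [(primary["provider"], primary["default"])]
    # Assumed precedence: new-style list first, then the legacy single entry.
    for entry in config.get("fallback_providers", []):
        candidates.append((entry["provider"], entry["model"]))
    legacy = config.get("fallback_model")
    if legacy:
        candidates.append((legacy["provider"], legacy["model"]))
    chain: list[tuple[str, str]] = []
    for item in candidates:
        if item not in chain:
            chain.append(item)
    return chain


def uses_openrouter(config: dict[str, Any]) -> bool:
    """True if the chain or the compression summary points at OpenRouter."""
    providers = {provider for provider, _ in build_chain(config)}
    providers.add(config.get("compression", {}).get("summary_provider", ""))
    return "openrouter" in providers
```

Because the legacy and new fallback entries are identical here, the chain collapses to two tiers: MiniMax first, then Ollama.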

Backup:

~/.hermes/config.yaml.bak-20260414-provider-chain

3.2 `~/.hermes/hermes-agent/hermes_cli/runtime_provider.py`

Changes:

Added a list of local provider aliases
When requested=ollama, resolve directly to the local OpenAI-compatible endpoint instead of falling through to OpenRouter
Added a provider family check so that the OpenRouter runtime's api_mode does not wrongly inherit the api_mode of the current model config

Key effect:

before: requested=ollama → custom → no custom endpoint → openrouter(no key)
after:  requested=ollama → custom(local:ollama) → http://localhost:11434/v1
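
The before/after behavior can be illustrated with a small resolver. This is a sketch under assumptions, not the actual contents of `runtime_provider.py`: `LOCAL_PROVIDER_ALIASES`, `RuntimeTarget`, and `resolve_requested` are hypothetical names, and only the `ollama` entry and its URL come from this report. Raising an error for an unconfigured custom endpoint is a design choice made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table; only the ollama entry is taken from this report.
LOCAL_PROVIDER_ALIASES = {"ollama": "http://localhost:11434/v1"}


@dataclass(frozen=True)
class RuntimeTarget:
    provider: str
    api_mode: str
    base_url: Optional[str]


def resolve_requested(requested: str,
                      custom_base_url: Optional[str] = None) -> RuntimeTarget:
    """Resolve a requested provider name without falling through to OpenRouter."""
    name = requested.strip().lower()
    if name in LOCAL_PROVIDER_ALIASES:
        # Local aliases map straight to an OpenAI-compatible endpoint.
        return RuntimeTarget("custom", "chat_completions",
                             LOCAL_PROVIDER_ALIASES[name])
    if name == "custom":
        if custom_base_url is None:
            # The old path continued to OpenRouter here; this sketch fails loudly.
            raise ValueError("custom provider requested but no endpoint configured")
        return RuntimeTarget("custom", "chat_completions", custom_base_url)
    raise ValueError(f"provider {requested!r} is not handled by this sketch")
```

With this shape, `resolve_requested("ollama")` yields a `custom` target on the local endpoint, and a missing custom endpoint surfaces as an error rather than a keyless OpenRouter call.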

Backup:

~/.hermes/hermes-agent/hermes_cli/runtime_provider.py.bak-20260414-provider-chain

4. State after the provider priority cleanup

Runtime resolution in a new process:

Request	Result
default / None	`minimax`, `anthropic_messages`, `https://api.minimax.io/anthropic`
`minimax`	same
`ollama`	`custom`, `chat_completions`, `http://localhost:11434/v1`
`openrouter`	`openrouter`, key empty, not used in the default chain
`anthropic`	fails: no credentials
`github-copilot`	token resolvable, but not used in the default chain

Operational chain:

Tier 0: MiniMax-M2.7-highspeed via MiniMax Anthropic-compatible API
Tier 1: Ollama qwen2.5:3b via localhost:11434/v1
Disabled: OpenRouter(no key), Anthropic direct(no key/OAuth), OpenAI Codex(refresh 401), Copilot(quota pressure)
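
One way to read the operational chain is as an ordered walk that returns the first tier that answers. A minimal sketch follows; `call_with_fallback` and the tier callables are hypothetical stand-ins, not Hermes APIs, and treating every exception as a reason to move on is an assumption.

```python
from typing import Callable, Sequence

# A tier is a name plus a callable that takes a prompt and returns a reply.
Tier = tuple[str, Callable[[str], str]]


def call_with_fallback(prompt: str, tiers: Sequence[Tier]) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, reply) from the first success."""
    errors: list[str] = []
    for name, call in tiers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure moves to the next tier
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

Disabled providers are simply left out of the `tiers` sequence, so they can never be reached by accident.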

5. Copilot quota monitor design

From today's OpenClaw LLM log:

github-copilot calls: 8534
ok: 6
429: 44
first 429: 2026-04-14T10:11:31+09:00
last 429:  2026-04-14T14:27:30+09:00

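Recommended design:

Parse ~/.openclaw/logs/llm/YYYYMMDD.jsonl every 15 minutes
Among github-copilot/* entries, count occurrences of HTTP Error 429, RateLimitReached, and 12 per 86400s
On the first daily-quota 429, store the next reset timestamp in a state file such as ~/.hermes/state/provider_cooldown.json
Skip Copilot in the Hermes/OpenClaw provider chain until the reset
Send an INFO notice to the notification center at 10 successful calls in a day or on the first 429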
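
The parsing and cooldown steps above could look roughly like the following. This is a sketch under assumptions: the log schema is not shown in this report, so the field names `provider`, `error`, and `ts` (ISO 8601) are guesses, and the reset time is assumed to be the first quota 429 plus 86400 seconds. `scan_copilot_quota` is a hypothetical helper.

```python
import json
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Substrings that mark a Copilot quota rejection (taken from the design above).
QUOTA_MARKERS = ("HTTP Error 429", "RateLimitReached", "12 per 86400s")
QUOTA_WINDOW = timedelta(seconds=86400)


def scan_copilot_quota(lines: Iterable[str]) -> dict:
    """Count Copilot quota errors and derive an assumed reset timestamp."""
    hits = 0
    first_hit: Optional[datetime] = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partially written or malformed lines
        if not isinstance(record, dict):
            continue
        # Field names below are assumptions about the log schema.
        if not str(record.get("provider", "")).startswith("github-copilot"):
            continue
        error = str(record.get("error") or "")
        if not any(marker in error for marker in QUOTA_MARKERS):
            continue
        hits += 1
        ts_raw = record.get("ts")
        if ts_raw:
            ts = datetime.fromisoformat(ts_raw)
            if first_hit is None or ts < first_hit:
                first_hit = ts
    reset_at = (first_hit + QUOTA_WINDOW).isoformat() if first_hit else None
    return {"provider": "github-copilot", "quota_429_count": hits,
            "reset_at": reset_at}
```

The returned dict could be written with `json.dump` to the cooldown state file, and the provider chain would then skip Copilot while the current time is earlier than `reset_at`.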

No cron job was registered in this task. Instead, the Hermes primary was switched to MiniMax in the current config, which stops new Copilot consumption first.

6. Test results

6.1 resolver / config verification

None     → minimax / anthropic_messages / key present
minimax  → minimax / anthropic_messages / key present
ollama   → custom / chat_completions / http://localhost:11434/v1
openrouter → openrouter / chat_completions / key empty
anthropic → no credentials

6.2 unit tests

python -m pytest -q tests/test_runtime_provider_resolution.py tests/test_fallback_model.py tests/test_provider_fallback.py
89 passed in 63.53s

6.3 Real Hermes CLI LLM call

Command:

python -m hermes_cli.main chat -q '한국어로 OK 한 단어만 답해.' --provider minimax -m MiniMax-M2.7-highspeed -Q --max-turns 1 --source tool

Result:

OK
session_id: 20260414_143454_eacaad

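7. Remaining risks

The running 18789 gateway process was not restarted, so it may still hold the old config/code in memory
OpenAI Codex OAuth fails refresh with 401 and needs a separate re-login
There is no direct Anthropic key/OAuth. To use it, 해리 must provide a key/OAuth
MiniMax depends on an external API key, so it needs its own cost/quota monitoring
The CLI enum for hermes chat --provider does not yet include ollama, so it cannot be selected directly as a CLI provider. The fallback path and the runtime resolver were confirmed working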

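8. Self-assessment

Criterion	Score	Basis
Accuracy	4.7	Verified the actual config, keys, runtime, and CLI call; OpenClaw/shared not modified
Completeness	4.6	Provider chain cleaned up, fallback made explicit, tests done. Gateway restart left pending approval
Verification	4.8	resolver, py_compile, 89 unit tests, and one real LLM call passed
Minimal change	4.8	Only the Hermes runtime and Hermes config changed. OpenClaw not modified

Overall: 4.72/5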
