article

Hermes’ Codex App-Server Runtime: What It Changes for Everyday Development

A practical, source-grounded guide to Hermes Agent’s optional Codex app-server runtime and how it changes coding workflows.

PublisherWayDigital

Published2026-05-20 04:26 UTC

Languageen

Regionglobal

CategoryProduct Notes

Hermes’ Codex App-Server Runtime: What It Changes for Everyday Development

Hermes Agent’s Codex App-Server Runtime is an optional bridge between two tool worlds: Hermes’ long-running agent shell and OpenAI Codex CLI’s local app-server runtime. When it is enabled, Hermes stops running the OpenAI/Codex turn through its own tool loop and hands that turn to codex app-server. Codex then handles terminal commands, file edits, patch application, sandboxing, approval requests, and native Codex plugins. Hermes stays around as the outer shell: sessions, slash commands, gateways, memory and skill review, cron/kanban orchestration, and selected richer tools through an MCP callback.

The short version: this feature lets a developer keep using Hermes as the daily command center, while letting OpenAI/Codex-model work run inside the same runtime Codex CLI and the Codex app experience use.

What the feature does

The official Hermes documentation describes the runtime as opt-in only. Nothing changes until the user turns it on. Once enabled for openai/* or openai-codex/* turns, the model gets three tool surfaces:

Codex built-ins: shell for terminal work, file reads and searches; apply_patch for structured edits; update_plan for planning; view_image for local image inspection; and Codex’s own web_search when configured.
Native Codex plugins: plugins already installed through Codex, such as GitHub, Linear, Gmail, Google Calendar, Outlook, and Canva, are discovered and written into Codex’s config for use inside the Hermes session.
Hermes callback tools: Hermes registers a stdio MCP server named around its own tool callback. Codex can call back into Hermes for tools Codex does not ship with, including Hermes web extraction, browser automation, vision analysis, image generation, skill reading, and text-to-speech.

That split matters. A file edit is no longer “Hermes asks the model to call Hermes’ patch tool.” It becomes “Codex applies the edit through its own app-server protocol and sandbox, while Hermes records and projects the result back into its normal transcript shape.”

How it works under the hood

OpenAI’s Codex app-server documentation says codex app-server speaks a JSON-RPC-style protocol over transports such as stdio, websocket, or unix socket. The Hermes implementation uses stdio. A Hermes turn creates or reuses a CodexAppServerSession, performs the app-server initialize handshake, starts a Codex thread, starts a turn, then listens for streamed item/* and turn/* events until completion.

The source code in Hermes’ agent/codex_runtime.py confirms the key design choice: run_codex_app_server_turn hands a full turn to the Codex subprocess and then projects Codex events back into Hermes’ message list. That projection is not cosmetic. It is how Hermes’ session database, memory review, and skill review can still see a normal-looking conversation even though the tool execution happened inside Codex.

Enabling the runtime also performs a config migration. Hermes writes a managed block in ~/.codex/config.toml, migrates Hermes MCP servers into the TOML shape Codex expects, registers the Hermes tools callback, and sets default_permissions = ":workspace" so normal workspace writes do not prompt for every single operation. Content outside the managed block is preserved.

Why it exists

The feature solves a practical gap. Hermes is provider-agnostic and has its own mature tool dispatch. Codex CLI has a strong local runtime for OpenAI coding workflows: subscription-based ChatGPT authentication, local sandboxing, patch application, app-server events, and a plugin system. Developers who use both do not want to choose between them every time they start work.

The GitHub pull request that introduced the feature describes it as Hermes’ answer to a request to route OpenAI agents through Codex, while keeping it opt-in because Hermes supports many non-OpenAI providers and has a large test suite tied to its own dispatch path. That is the logic behind the design: keep Hermes’ default behavior stable, but allow OpenAI/Codex users to swap in Codex’s runtime where it is strongest.

The demand behind the feature

The demand is easy to recognize if you work with coding agents daily:

Developers want to use their ChatGPT plan. OpenAI’s Codex README recommends signing in with ChatGPT so Codex can be used as part of Plus, Pro, Business, Edu, or Enterprise plans. Hermes’ runtime lets those Codex turns flow through that path instead of requiring a separate API-key-only workflow.
Developers want better local control. Codex app-server exposes threads, turns, items, sandbox settings, approval policies, command execution, MCP calls, plugin installation, and plugin listing. Hermes can wrap that runtime rather than reimplementing every Codex-specific behavior.
Developers want one agent shell across channels. Hermes runs in the terminal and messaging gateways. The Codex runtime lets the same Hermes chat surface use Codex’s local tools for code work while still keeping Hermes features such as sessions, slash commands, kanban workers, and skill review.
Developers want plugins without reconfiguration. If GitHub or Calendar is already authorized in Codex, Hermes can discover installed Codex plugins through Codex’s plugin/list RPC and activate them in the Codex runtime session.

How a developer turns it on

The prerequisites are concrete:

npm i -g @openai/codex
codex --version
codex login

Hermes’ docs mention Codex CLI 0.130.0 or newer in the user-facing setup guide; the current npm metadata queried on 2026-05-20 showed @openai/codex at 0.132.0. After Codex is installed and authenticated, enable the runtime inside Hermes:

/codex-runtime codex_app_server

Useful companion commands:

/codex-runtime        # show current state
/codex-runtime auto   # return to Hermes default runtime
/codex-runtime on     # synonym for codex_app_server
/codex-runtime off    # synonym for auto

Manual config is also supported:

model:
  openai_runtime: codex_app_server

The switch takes effect on the next session. That delay is deliberate: Hermes avoids changing the model/tool runtime in the middle of a cached turn.

What it feels like in daily work

Consider a normal debugging request:

Fix the failing checkout tests, inspect the related GitHub issue, and leave a short note explaining the change.

With the Codex runtime enabled, Codex can use shell to run the test suite, inspect files, and search the repository. It can use apply_patch to make a multi-file fix. If the GitHub Codex plugin is installed and authorized, it can inspect the issue or pull request through Codex’s plugin surface. If it needs a browser reproduction path, it can call back into Hermes’ browser automation through MCP. Hermes still shows the interaction in its session, keeps slash commands available, and records enough projected tool activity for later memory or skill review.

Another example is a product engineer working from a chat gateway. They can message Hermes from Feishu or Telegram, ask it to check a Linear ticket, patch the repo, run tests, and summarize the result. Codex performs the local code work and plugin calls. Hermes remains the gateway, session owner, and long-term workflow layer.

Where it is useful

OpenAI/Codex-heavy coding: projects where the developer already trusts Codex CLI for local edits, tests, and patching.
Subscription-based usage: teams or individuals who prefer ChatGPT-plan authentication for Codex instead of separate API usage for every coding turn.
Plugin-centered workflows: work that jumps between code, GitHub, Linear, Gmail, Calendar, or design tools.
Gateway-based development: running a real coding agent from chat while preserving Hermes’ channels, sessions, commands, and review loop.
Kanban worker dispatch: Hermes documentation states kanban workers can run on the Codex runtime; the callback exposes kanban completion and status tools so workers can report back to the board.

What not to use it for

It is not a universal replacement for the default Hermes runtime. Some Hermes tools require the live AIAgent loop context and cannot be driven by a stateless MCP callback. The official list is: delegate_task, memory, session_search, and Hermes’ todo. Codex has its own update_plan, but that is not the same as Hermes’ todo store. If a task depends on subagents, cross-session search, or direct memory writes inside the turn, switch back with /codex-runtime auto.

Cron is also described cautiously in the docs: it should follow the same availability rules because cron runs through AIAgent.run_conversation, but it was not specifically tested in that document. For unattended jobs that depend on Hermes-only loop tools, a default-runtime profile is safer.

Approvals and safety

Codex can request approval before executing commands or applying patches. Hermes translates those requests into its standard dangerous-command prompt: allow once, allow for the session, or deny. Codex permission profiles still matter. Hermes defaults to the workspace profile when enabling the runtime, which allows writes inside the current workspace while preserving sandbox boundaries. Codex also has read-only and no-sandbox profiles; the latter is explicitly not something to use casually.

The core logic in one sentence

Hermes’ Codex App-Server Runtime is a controlled handoff: keep Hermes as the durable agent shell, but let Codex own OpenAI/Codex turns when Codex’s local runtime, sandbox, plugins, and ChatGPT authentication are the better execution layer.

Sources

More from WayDigital

Continue through other published articles from the same publisher.

上一篇Hermes 的 Codex App-Server Runtime：把 Codex 的本地运行时接进 Hermes2026-05-20 04:26 UTC 下一篇AI Daily Digest — 2026-05-202026-05-20 00:06 UTC

Hermes’ Codex App-Server Runtime: What It Changes for Everyday Development

Hermes’ Codex App-Server Runtime: What It Changes for Everyday Development

What the feature does

How it works under the hood

Why it exists

The demand behind the feature

How a developer turns it on

What it feels like in daily work

Where it is useful

What not to use it for

Approvals and safety

The core logic in one sentence

Sources

More from WayDigital

Comments