ADR-005: Mercury Key Exchange

Date        Author            Status
2026-05-09  Fabian Beyerlein  Accepted

Context

Mercury agents establish a secure application-layer channel to Nexus on top of gRPC/TLS. The channel has two cryptographic concerns:

Identity (enrollment): At first boot, Mercury generates a keypair, registers the public key with Nexus, and receives Nexus's public key in return. These long-lived keys anchor the agent's identity.

Session (key exchange): Before each encrypted session, Mercury performs an ephemeral key exchange with Nexus to derive a session key for ChaCha20-Poly1305 AEAD. Session keys are rotated periodically.

Since Mercury has not yet been deployed to production, there are no existing enrolled agents, no stored identity material to migrate, and no wire compatibility constraints. This ADR defines the cryptographic design from scratch.

Cryptographic context

NIST finalised FIPS 203 (ML-KEM), FIPS 204 (ML-DSA), and FIPS 205 (SLH-DSA) in August 2024. The relevant threat is Harvest Now, Decrypt Later (HNDL): ciphertext recorded today can be decrypted retroactively once a cryptographically relevant quantum computer (CRQC) exists. X25519 is broken by Shor's algorithm.

The symmetric layer (ChaCha20-Poly1305 with 256-bit keys) is unaffected — Grover's algorithm halves effective key strength to ~128-bit, which is acceptable.

This ADR covers the application-layer key exchange only.

Urgency ranking:

  • High: X25519 key exchange → HNDL risk on session and identity-bound material
  • Lower: Ed25519 signing → only at risk if a CRQC exists at signing time; signatures are not subject to HNDL

crypto/mlkem is available in Go 1.24+. crypto/mldsa is expected in Go 1.27 (est. Aug 2026); until then, Ed25519 is retained for signing as it does not carry HNDL risk.

Decision

Design the enrollment and key exchange protocol using a hybrid of ephemeral X25519 and ML-KEM-1024 for session key agreement, with Ed25519 for signing (to be replaced by ML-DSA-65 once crypto/mldsa ships in Go 1.27).

The static-static X25519 DH contribution used in the previous design sketch is not implemented. It provided implicit mutual authentication, which is instead covered by an explicit Ed25519 signature over the key exchange transcript.

Enrollment

Mercury generates one long-lived keypair at first boot:

Role     Algorithm  Private key  Public key
Signing  Ed25519    64 bytes     32 bytes

Server keypairs are issued per agent identity rather than per org or globally. This bounds the blast radius of a key compromise to a single agent and collapses key rotation into the existing re-enrollment flow: revoking and re-issuing an agent identity is the rotation mechanism. Org-level or global keys would require a separate rotation protocol (overlap window, key distribution, client-side fallback during transition), which re-enrollment already provides for free.

Enrollment bootstrap

Enrollment requires a one-time PASETO v4.public token generated in the STREAM UI by an authenticated operator. The token is signed by the Solaris server key and encodes the target org/agent context. Mercury presents this token in the enrollment request; Nexus verifies it through Solaris before accepting any key material. This is the trust anchor that prevents arbitrary agents from enrolling. Infrastructure constraints mean that Nexus can only accept TLS connections in deployed environments. For the enrollment, this trusted channel is sufficient.

No signature verification is performed during enrollment — the PASETO token is the only authentication material available.

PASETO tokens carry a short exp (24 hours). Unconsumed, expired enrollments are removed by a background job that deletes the enrollment and its dangling server keys.

Enrollment request

Mercury sends its public key to Nexus alongside the PASETO token:

message EnrollRequest {
  string token      = 1;
  bytes  public_key = 2;
}

Nexus proxies the enrollment request to Solaris. Solaris verifies the PASETO token, accepting the enrollment and creating the Agent Identity in the database. On success, Solaris returns its own public key which Nexus returns to Mercury.

message EnrollResponse {
  string agent_id          = 1;
  string identity_id       = 2;
  bytes  server_public_key = 3;
  string name              = 4;
  string org_id            = 5;
}

Solaris must ensure that the enrollment process has no TOCTOU window by using the existing Redis-based distributed locking mechanism. Upon parsing the PASETO token, Solaris takes a lock on the encoded EnrollmentID and continues only if the lock is acquired. The locking mechanism uses atomic operations implemented in Lua. While the lock is held and before the enrollment is processed further, Solaris marks the enrollment as "consumed" to prevent re-use. The consumed mark and the agent identity creation are committed in the same database transaction; the lock is released after commit.

Stored identity (Mercury, bbolt)

type Identity struct {
    ID              string
    AgentID         string
    OrgID           string
    PrivateKey      ed25519.PrivateKey // 64 bytes
    PublicKey       ed25519.PublicKey  // 32 bytes
    ServerPublicKey ed25519.PublicKey  // 32 bytes
    CryptoVersion   CryptoVersion      // currently: 1
    CreatedAt       time.Time
    Status          EnrollmentStatus
}

// CryptoVersion identifies the signature scheme of the stored keys.
type CryptoVersion uint8

const (
    CryptoVersion_Ed25519   CryptoVersion = 1
    CryptoVersion_ML_DSA_65 CryptoVersion = 2
)

CryptoVersion is stored from day one to allow future key type changes (e.g. ML-DSA adoption) without ambiguity.

Key Exchange

Mercury initiates a key exchange before each session. The exchange is purely ephemeral — no static key material contributes to the session key. Mutual authentication is provided by transcript signatures instead.

Mercury uses a stream interceptor to inject an authorization header into the metadata. Nexus uses a stream interceptor to verify the identity of the Mercury instance trying to initiate a stream. Additionally, the stream init handshake contains a challenge field that authenticates the ephemeral key exchange itself.

Both authentication methods use the same concept: a PoP string.

The PoP string has the following format:

PoP kid=<identity id>,sig=<signature>,ts=<unix timestamp>,nonce=<random nonce>

Stream authentication using the authorization header uses the following signature challenge format. A signature over this transcript is placed in the sig field of the PoP string.

<request method>|<unix timestamp>|<random nonce>[|<base64 encoded sha256 of body>]

The key exchange challenge instead signs all fields of the protobuf message (except the challenge itself) and places the signature in the sig field of the PoP string.

Protocol

Mercury → Nexus:

message StreamInitRequest {
  string challenge                       = 1;
  bytes  ephemeral_x25519_public_key     = 2;
  bytes  ephemeral_kem_encapsulation_key = 3;
}

Nexus → Mercury:

message StreamInitResponse {
  string session_id                  = 1;
  string challenge                   = 2;
  bytes  ephemeral_x25519_public_key = 3;
  bytes  ephemeral_kem_ciphertext    = 4;
}

Authentication

Request and response authentication is provided by the session AEAD using ChaCha20-Poly1305. The stream protocol is adjusted to encrypt the entire CloudMessage (or AgentMessage). This provides authentication over all fields (i.e. MessageID, Control, Payload) in addition to providing secrecy. Because the session key is derived from a mutually authenticated key exchange, the AEAD tag on each message transitively binds it to the long-lived agent identity.

The stream message frames are constructed as follows:

message AgentMessageFrame {
  oneof body {
    StreamInitRequest init              = 1;
    bytes             encrypted_payload = 2; // AgentMessageEnvelope
  }
}

message CloudMessageFrame {
  oneof body {
    StreamInitResponse init              = 1;
    bytes              encrypted_payload = 2; // CloudMessageEnvelope
  }
}

Only a StreamInitRequest/StreamInitResponse can be sent unencrypted. Both Nexus and Mercury enforce encryption/decryption from/to encrypted_payload in the tunnel implementation.

Per-message signatures are deliberately omitted. They would add non-repudiation, which is not required: the audit trail lives in the database, not in cryptographic message logs.

Session key derivation

dh_secret  = X25519(local_eph_priv, peer_eph_dh_pub)    // 32 bytes
kem_secret = DecapsulationKey.Decapsulate(kem_ct)       // 32 bytes
ikm  = dh_secret || kem_secret                          // 64 bytes

info = "stream/mercury-session/v1" || agent_id || H(mercury_dh_pub || mercury_kem_ek || nexus_dh_pub || nexus_kem_ct)
key  = HKDF-SHA256(ikm=ikm, salt=nil, info=info, len=32)

The hybrid combiner (concatenation into HKDF) means the session key requires breaking both X25519 classically and ML-KEM lattice security. If either holds, the session key is safe.

Session duration

Session duration is configurable per deployment. Default: 1h. Maximum enforced by Mercury: 12h. Minimum enforced by Mercury: 5m. Values below the default are permitted down to the 5m minimum; customers with stricter requirements may shorten the rotation interval.

AEAD nonce: 12 bytes from crypto/rand per message. Collision probability is negligible at any realistic message rate within the 12h session ceiling (birthday bound ~2⁴⁸ messages per session key).
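The per-message random nonce can be sketched against Go's cipher.AEAD interface. Two assumptions are made here: the nonce is prepended to the ciphertext inside encrypted_payload (the text does not specify the layout), and AES-256-GCM is swapped in as a standard-library stand-in, because ChaCha20-Poly1305 lives in golang.org/x/crypto/chacha20poly1305 rather than the standard library. Both AEADs use a 12-byte nonce and a 16-byte tag, so seal and open would work unchanged with either.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// seal encrypts plaintext under a fresh random nonce and returns
// nonce || ciphertext || tag. The layout is an assumption of this sketch.
func seal(aead cipher.AEAD, plaintext []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize(), aead.NonceSize()+len(plaintext)+aead.Overhead())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then authenticates and decrypts the rest.
func open(aead cipher.AEAD, frame []byte) ([]byte, error) {
	if len(frame) < aead.NonceSize() {
		return nil, errors.New("frame shorter than nonce")
	}
	nonce, ct := frame[:aead.NonceSize()], frame[aead.NonceSize():]
	return aead.Open(nil, nonce, ct, nil)
}

// newStandInAEAD returns AES-256-GCM as a stand-in for ChaCha20-Poly1305.
func newStandInAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; a real key comes from the key exchange
	aead, err := newStandInAEAD(key)
	if err != nil {
		panic(err)
	}
	frame, err := seal(aead, []byte("hello"))
	if err != nil {
		panic(err)
	}
	pt, err := open(aead, frame)
	fmt.Println(len(frame), string(pt), err)
}
```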

Wire sizes

Direction        Contents             Size
Mercury → Nexus  eph_dh_pub + kem_ek  32 + 1568 = 1600 bytes
Nexus → Mercury  eph_dh_pub + kem_ct  32 + 1568 = 1600 bytes

Signing key upgrade path

When crypto/mldsa ships in Go 1.27 (est. Aug 2026), replace Ed25519 with ML-DSA-65:

                     Ed25519     ML-DSA-65
Public key           32 bytes    1952 bytes
Signature            64 bytes    3309 bytes
PoP header (base64)  ~200 bytes  ~4.4 KB

CryptoVersion = 2 in the identity store marks ML-DSA identities. Verify that gRPC metadata limits on Nexus accommodate the larger PoP header before adopting (default gRPC max header is 8KB; configurable).
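The header growth comes almost entirely from the base64-encoded signature. As a quick check of the arithmetic, assuming padded standard base64 (the constant and function names are illustrative):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const (
	ed25519SignatureSize = 64   // bytes
	mldsa65SignatureSize = 3309 // bytes, per FIPS 204
)

// encodedSignatureLen returns the padded base64 length of a raw signature.
func encodedSignatureLen(rawLen int) int {
	return base64.StdEncoding.EncodedLen(rawLen)
}

func main() {
	fmt.Println(encodedSignatureLen(ed25519SignatureSize)) // 88
	fmt.Println(encodedSignatureLen(mldsa65SignatureSize)) // 4412
}
```

The kid, ts, and nonce fields add to these figures, so the full header is somewhat larger than the encoded signature alone.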

Consequences

Positive

  • Session keys are quantum-resistant from day one via hybrid X25519 + ML-KEM-1024.
  • No static key material in session derivation — forward secrecy is unconditional.
  • Explicit PoP interceptors provide stronger mutual authentication than the previous implicit static-DH approach, and neither check is bypassable in application code.
  • No migration complexity; greenfield implementation.
  • All primitives are Go stdlib (crypto/mlkem, crypto/ecdh, crypto/hkdf, crypto/ed25519).

Negative

  • Key exchange payload is ~1.6KB per direction vs ~64 bytes for a plain X25519 exchange. Confirm no infrastructure component between Mercury and Nexus (NATS limits, any proxy buffer) rejects payloads of this size.

References