Graphiant Post-Quantum Cryptography


NIST Post-Quantum Cryptography Standards

In August 2024, NIST finalized the first three post-quantum standards.

These replace the RSA and ECC primitives that underpin virtually all encrypted communications in production today.

The Standards

  • FIPS 203 — ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism):

    • Replaces ECDH/DH for key agreement

    • Three parameter sets: ML-KEM-512, ML-KEM-768, ML-KEM-1024

    • Graphiant deploys ML-KEM-1024 (NIST Security Level 5)

  • FIPS 204 — ML-DSA (Module-Lattice-Based Digital Signature Algorithm):

    • Replaces ECDSA/RSA for digital signatures and certificate signing

    • Three parameter sets: ML-DSA-44, ML-DSA-65, ML-DSA-87

  • FIPS 205 — SLH-DSA (Stateless Hash-Based Digital Signature Algorithm):

    • Conservative hash-based signatures

    • Larger but relies only on hash-function security

    • Used as a fallback if lattice-based schemes face new attacks

  • HQC (selected March 2025):

    • Code-based KEM providing algorithm diversity against lattice-specific breakthroughs

NIST Timeline

RSA and ECC are slated to be deprecated by 2030 and disallowed by 2035.

CNSA 2.0 will mandate PQC for all national security systems by 2030.

Civilian agencies must comply by 2035.

The EU NIS2 directive and DORA regulation impose similar timelines on European financial institutions.

With recent developments—especially driven by AI and advances in quantum error correction—the timeline is accelerating. The number of qubits required to break modern encryption has dropped dramatically, and what was once thought to require millions of qubits may now be achievable with tens of thousands to a few hundred thousand.

As a result, many industry leaders now project that the timeline could move up to as early as late 2028 to 2029, making the move to post-quantum security an urgent business priority.

PQC Key and Certificate Size Analysis

The defining characteristic of PQC primitives is size inflation. While symmetric algorithms (AES-256, ChaCha20) are unaffected by quantum computers and remain at 256-bit key lengths, the asymmetric primitives used for key exchange, signatures, and certificates grow by one to two orders of magnitude.

| Artifact | Classic (ECC) | PQC (ML-KEM/DSA) | Growth | Impact |
| --- | --- | --- | --- | --- |
| KEM public key | 32 B (X25519) | 1,568 B (ML-KEM-1024) | 49× | High |
| Digital signature | 64 B (ECDSA) | 3,309 B (ML-DSA-65) | 52× | High |
| X.509 certificate | ~600 B | ~5,000 B | ~8× | High |
| Full cert chain | ~1.5 KB | ~12 KB | ~8× | Very High |
| IKE/TLS handshake | ~1.5 KB | ~15 KB | 10× | Critical |
| Symmetric key (AES-256) | 32 B | 32 B | None | None |
| AEAD tag | 16 B | 16 B | None | None |
| IPsec ESP overhead | ~20 B | ~20 B | None | None |

The bottom half of the above table is the critical insight:

  • Symmetric keys, AEAD tags, and per-packet ESP overhead are completely unaffected by PQC.

  • The entire scaling problem is in the asymmetric handshake and certificate exchange, not in the data plane.

The N-Squared IKE Scaling Problem

IKEv2 between two endpoints performs mutual authentication (certificate exchange, signature verification) and a Diffie-Hellman or ML-KEM key exchange to derive a shared secret. Each pair of endpoints that needs to communicate runs this handshake independently.

In a full mesh of “N” sites, the total number of independent IKE sessions is “N(N-1)/2”. Under classic ECC, the handshake is small (a few hundred bytes) and computationally inexpensive. Under PQC, each handshake is 10 to 50 times larger, fragments over UDP, requires multiple round-trips, and consumes significantly more CPU for signature verification.
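The quadratic blow-up is easy to make concrete. A short arithmetic sketch, using the ~15 KB PQC handshake figure from the sizing table above (illustrative, not a capacity model):

```python
def pairwise_sessions(n: int) -> int:
    """Independent IKE sessions in a full mesh of n sites: n(n-1)/2."""
    return n * (n - 1) // 2

def handshake_bytes(n: int, per_handshake: int) -> int:
    """Total handshake traffic for a full-mesh bring-up, in bytes."""
    return pairwise_sessions(n) * per_handshake

# Classic ECC handshake ~1.5 KB vs PQC ~15 KB per session
for sites in (10, 100, 1000):
    mesh = pairwise_sessions(sites)
    print(f"{sites:>5} sites: {mesh:>7} sessions, "
          f"~{handshake_bytes(sites, 15_000) / 1e6:.1f} MB of PQC handshake traffic")
```

At 1,000 sites the mesh runs 499,500 handshakes; the same fabric under Graphiant's model runs 1,000 enrollments.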

| Site Count | Classic IKE Sessions (N(N−1)/2) | Graphiant Enrollments (N) | Reduction |
| --- | --- | --- | --- |
| 10 | 45 | 10 | 4.5× |
| 100 | 4,950 | 100 | 49.5× |
| 500 | 124,750 | 500 | 249.5× |
| 1,000 | 499,500 | 1,000 | 499.5× |

What Fails at Scale

  • UDP fragmentation storms:

    • PQC payloads exceed typical MTU sizes.

    • Middleboxes silently drop fragments, causing handshake failures.

  • Cold-start latency:

    • Traffic waits on a multi-round-trip PQC handshake before any data flows.

    • Applications experience connection timeouts.

  • Rekey thundering herd:

    • N-squared handshakes re-run on every certificate rotation or crypto-profile change.

    • The burst can overwhelm gateway CPU.

  • Blast radius:

    • Each Edge holds keying material for every peer.

    • A single compromise affects all pairwise sessions.

Hub-and-Spoke IPsec at Thousands of Sites

Enterprises often avoid the full-mesh N-squared problem by deploying a hub-and-spoke IPsec topology:  a central headend gateway (or a small cluster of regional hubs) terminates every remote-site tunnel.  This reduces the tunnel count from N(N−1)/2 to N, making it appear that the scaling concern disappears. In practice, hub-and-spoke under PQC concentrates the problem rather than solving it.

The headend becomes the bottleneck.  Every remote site runs a full PQC IKE handshake with the hub. At 2,000 sites, the hub must handle 2,000 concurrent IKE sessions, each exchanging ~15 KB of PQC payload (ML-KEM public keys, ML-DSA signatures, and PQC certificate chains). That is 30 MB of handshake traffic at initial bring-up, plus 10–30 times the CPU cost per signature verification compared to classic ECDSA.  During a rekey window — which must be periodic and may be triggered by policy change or certificate rotation — all 2,000 sessions re-handshake simultaneously.  A single headend gateway must absorb the full burst.

Memory per peer scales linearly but expensively.  Each spoke requires the hub to maintain an IKE SA, one or more IPsec Child SAs (inbound and outbound SPIs), the remote peer’s PQC certificate chain (~12 KB), cached KEM material for rekeying, and traffic selector / SPD entries.  At 2,000 peers, the hub holds tens of megabytes of SA and certificate state.  Traditional headend appliances (Cisco ASA, Fortinet FortiGate, Palo Alto) were designed for classic-size state entries;  PQC inflates per-peer state by 8–10 times.

New-session rate becomes the true ceiling.  Hub-and-spoke IPsec gateways are rated by their new-sessions-per-second (NSPS) metric.  Under classic IKE, a high-end gateway can establish 5,000–20,000 new IKE SAs per second.  Under PQC, ML-DSA signature verification alone is 10–30 times slower, cutting raw NSPS to 500–2,000, and UDP fragmentation retries and IKE retransmission backoff reduce the effective rate further.  A 3,000-site enterprise that reboots after a maintenance window (or after a certificate-authority re-issuance event) can take 90 seconds to 6 minutes to fully converge — during which some remote sites have no encrypted connectivity.

High availability requires state synchronization.  Active/standby hub clusters must synchronize all SA state to the standby.  Under PQC, the state set is 8–10 times larger, increasing failover synchronization time and the memory footprint of the standby.  Stateful failover becomes a capacity-planning problem, not just a reliability feature.

Inter-spoke traffic hairpins through the hub. In a pure hub-and-spoke topology, any traffic between two remote sites must traverse the hub twice (spoke A → hub → spoke B), doubling WAN bandwidth consumption and adding latency.  Some vendors offer “shortcut” or “dynamic spoke-to-spoke” tunnels that fall back to direct IKE between spokes on demand — but this reintroduces the pairwise PQC handshake problem for every spoke pair that communicates, negating the hub-and-spoke simplification.
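The hub-side rekey burst follows directly from multiplying spoke count by the ~15 KB handshake. A quick sketch of that arithmetic:

```python
PQC_HANDSHAKE_BYTES = 15_000  # ~15 KB per spoke, from the sizing table above

def hub_rekey_burst_mb(spokes: int) -> float:
    """Handshake traffic the hub absorbs when every spoke re-handshakes at once."""
    return spokes * PQC_HANDSHAKE_BYTES / 1e6

for spokes in (500, 1_000, 2_000, 5_000, 10_000):
    print(f"{spokes:>6} spokes -> ~{hub_rekey_burst_mb(spokes):.1f} MB burst at the hub")
```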

| Remote Sites | Hub IKE SAs | Rekey Burst (all) | Convergence at PQC NSPS |
| --- | --- | --- | --- |
| 500 | 500 | ~7.5 MB PQC traffic | 15–60 sec |
| 1,000 | 1,000 | ~15 MB PQC traffic | 30 sec–2 min |
| 2,000 | 2,000 | ~30 MB PQC traffic | 1–4 min |
| 5,000 | 5,000 | ~75 MB PQC traffic | 2.5–10 min |
| 10,000 | 10,000 | ~150 MB PQC traffic | 5–20 min |

The Graphiant contrast:  Graphiant’s controller-driven model avoids the hub-and-spoke bottleneck entirely.  Each Edge enrolls once with the controller (not with a central gateway), and BGP distributes key material through Route Reflectors — a protocol designed for exactly this scale.  There is no headend gateway to saturate, no centralized SA state to synchronize, no rekey thundering herd against a single device, and no hairpin for inter-site traffic.  The core is stateless and forwards on labels;  the Edges hold only their own SA table.  At 5,000 or 10,000 sites, the architecture behaves identically to 50 sites — the cost is linear, not concentrated.  

The PQC Certificate-Size Ceiling

The scaling problem is not limited to site-to-site VPN.

Any protocol that performs per-session asymmetric handshakes faces the same ceiling under PQC.

Three classes of infrastructure are particularly affected:

PQC-Capable IPsec Gateways

A headend IPsec gateway upgraded to support PQC must handle a ~15 KB handshake per remote peer.  At 1,000 peers, the gateway absorbs 15 MB in PQC handshake traffic alone, with 10 to 30 times the CPU cost for each signature verification.  New-session rate, rekey amplification, and per-peer memory (IKE SA plus certificate chain plus rekey material) all become bottlenecks.

PQC-Capable TLS Proxies

TLS proxies face a structurally worse problem:  state is per-application, per-endpoint — not per-host.  A proxy serving 500 applications to 200 users maintains 100,000 TLS sessions, each carrying its own PQC certificate chain.  Certificate pinning per application, session ticket bloat carrying PQC material, and the inability to pool sessions across apps make TLS proxy scaling especially brittle.  For mainframe environments using AT-TLS, each policy rule (of which a large bank may have 5,000 to 15,000) requires individual reconfiguration with a PQC certificate and cipher suite.

TLS is well suited to end-to-end security and per-application policy enforcement, but the per-application proxy model will not scale during a rapid PQC migration.  The timeline does not allow a measured, application-by-application transition:  Q-Day is getting closer, and the migration window is shrinking.  An IPsec model combined with an SDN architecture solves many of the scaling and transition challenges.  Operating at Layer 3 is far more practical for this transition:  it encrypts all traffic below the application layer in a single pass, requires no per-application certificate rotation or policy reconfiguration, and can be deployed in hours rather than years.  Enterprises can protect everything with network-layer PQC IPsec immediately, then migrate individual applications to native PQC TLS on their own schedule as vendors ship support and the threat timeline allows.
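The state asymmetry between the two models is simple to quantify. A sketch using the proxy figures from the paragraph above (illustrative only):

```python
def proxy_tls_sessions(apps: int, endpoints: int) -> int:
    """Per-application, per-endpoint TLS sessions a proxy must maintain,
    each carrying its own PQC certificate chain."""
    return apps * endpoints

def l3_tunnels(sites: int) -> int:
    """One network-layer PQC IPsec tunnel per site covers every application."""
    return sites

# 500 applications served to 200 users, from the proxy example above:
print(proxy_tls_sessions(500, 200))  # 100,000 TLS sessions
print(l3_tunnels(500))               # 500 tunnels for 500 sites
```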

IoT

Constrained IoT devices (MCUs with 256 KB flash and 64 KB RAM) cannot store a ~12 KB PQC certificate chain, much less perform ML-DSA signature verification in memory.  Radio airtime for a PQC handshake drains battery on battery-powered sensors.  Network-layer encryption, applied upstream at an edge device, is the only viable path for these endpoints.

The Symmetric Exception

The only cryptographic artifacts that do not grow under PQC are the symmetric-side operations: AES-256 traffic keys (32 bytes), AEAD authentication tags (16 bytes), IPsec ESP record overhead (~20 bytes), per-packet AES-NI CPU cost, and HKDF key derivation.

The symmetric data plane is quantum-safe by default.

The entire scaling problem lives in the asymmetric handshake and certificate exchange.
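The symmetric side can be shown concretely with the key-derivation step. This is a minimal HKDF-SHA256 (RFC 5869) in pure stdlib Python — a generic sketch, not Graphiant's key schedule — and its point is that the derived AES-256 traffic key is 32 bytes whether the input secret came from X25519 or ML-KEM:

```python
import hashlib
import hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869): extract-then-expand. The output length is set by the
    cipher (32 bytes for AES-256), independent of the input secret's origin."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                    # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# A shared secret from ML-KEM-1024 decapsulation is 32 bytes, same as X25519;
# either way the derived ESP traffic key is a 32-byte AES-256 key.
traffic_key = hkdf_sha256(b"\x00" * 32, salt=b"spi-and-nonces", info=b"esp traffic key")
print(len(traffic_key))  # 32
```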

Graphiant’s Controller-Driven IKE with BGP Key Distribution

Graphiant’s architecture collapses the N-squared handshake problem into an N-scale enrollment plus an eventually-consistent distribution fabric.

IKE is not eliminated; it is relocated.

The controller becomes the sole IKE anchor, and BGP becomes the key distribution plane.

The Four Steps

  • Step 1 — Edge enrolls with controller:

    • Each Graphiant Edge enrolls once with the controller over a PQC-secured channel (ML-KEM for key agreement).

    • Attestation (TPM, measured boot), identity binding, policy class, VRF membership, SLA assignment, and Flex-Algo topology are evaluated centrally.

    • Certificate validation in these IKE sessions (Edge to ODP servers and Core devices) uses traditional cryptography:

      • The Edge’s GAK certificate, issued by GCS, is validated against a chain that terminates in the Graphiant Root Certificate.

        • Note:  Making the authentication path quantum-resistant would require re-issuing the full chain and GCS changes to support ML-DSA in X.509 — standards for which are very new and not yet supported by the off-the-shelf components GCS relies on. Because this phase targets harvest-now, decrypt-later only, PQC authentication is deferred to a later phase.

  • Step 2 — Controller publishes Edge material:

    • The controller issues the Edge a short-lived identity plus the key material it will advertise to peers: a hybrid ECC and ML-KEM public key (not signed), tunnel endpoints, SPIs, and crypto profile references.

    • This material is injected into BGP.

  • Step 3 — BGP floods authorized peers:

    • Route Reflectors distribute each Edge’s public PQC material through the BGP control plane.

    • Only peers authorized by policy (matching VPN membership) accept and install the key exchange data (the Dynamic IPsec Material, or DIM).

    • Route Target (RT) constrained route distribution (RFC 4684) keeps key distribution scalable by flooding material only to peers that import the matching targets.

  • Step 4 — Local key derivation, no peer handshake:

    • The receiving Edge runs a KEM encapsulation against the peer’s public key, derives the pairwise traffic key, and programs the IPsec SA locally.

    • No round-trip with the peer.

    • No IKE_SA_INIT. No IKE_AUTH.
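The no-handshake property of Step 4 can be sketched with a stand-in key exchange. ML-KEM is not in the Python standard library, so classic finite-field Diffie-Hellman (RFC 3526 group 14) models the structure here; the class and method names are illustrative, not Graphiant's implementation:

```python
import hashlib
import secrets

# RFC 3526 group 14 (2048-bit MODP) parameters
P = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E088A67CC74"
    "020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B302B0A6DF25F1437"
    "4FE1356D6D51C245E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED"
    "EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3DC2007CB8A163BF05"
    "98DA48361C55D39A69163FA8FD24CF5F83655D23DCA3AD961C62F356208552BB"
    "9ED529077096966D670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B"
    "E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9DE2BCBF695581718"
    "3995497CEA956AE515D2261898FA051015728E5A8AACAA68FFFFFFFFFFFFFFFF", 16)
G = 2

class Edge:
    def __init__(self, name: str):
        self.name = name
        self._priv = secrets.randbelow(P - 2) + 1  # local secret; never leaves the Edge
        self.public = pow(G, self._priv, P)        # what BGP would flood (the DIM)

    def derive_sa_key(self, peer_public: int) -> bytes:
        """Program the pairwise IPsec SA locally from the peer's BGP-learned key."""
        shared = pow(peer_public, self._priv, P)
        return hashlib.sha256(shared.to_bytes(256, "big")).digest()

a, b = Edge("edge-a"), Edge("edge-b")
# Each side learned only the other's *public* value; no IKE exchange ran,
# yet both derive the same 32-byte traffic key.
assert a.derive_sa_key(b.public) == b.derive_sa_key(a.public)
```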

What the Controller Still Does

The controller’s responsibilities, now centralized rather than pairwise:

  • Authentication and attestation: Certificate validation (using traditional cryptography today; PQC X.509 is a later phase), platform attestation, identity binding. Performed N times, not N-squared.

  • Identity issuance: Short-lived Edge credentials are issued from the Graphiant portal and signed using traditional cryptography (ECDSA with ECC P384, or ECC P256 where the Edge’s TPM does not support P384). Rotation is a local re-enroll.

  • Policy evaluation: Who may talk to whom, per-VPN/VRF. This gates whether a peer’s BGP-advertised key material is installable.

  • Key lifecycle and rekey: Cadence, epoch rollover, group versus pairwise material. Orchestrated centrally, applied by BGP update.

  • Crypto agility: The profile is still negotiated, but the controller advertises only a single proposal at a time (hybrid ECC P-384 + ML-KEM, AES-GCM-256, and so on), which eliminates downgrade attacks.

  • Revocation: Compromised Edge identity is invalidated; BGP withdrawal propagates the revocation to all peers without pairwise coordination.

  • Data-assurance binding: Every installed SA is tied to an observable flow identifier, making tunnels verifiable end-to-end.

Why BGP Is the Right Distribution Fabric

  • Already the control plane of the Internet and the largest MPLS networks — reuse, not reinvent.

  • Route Reflector hierarchy scales to hundreds of thousands of sessions with decades of operator experience.

  • Withdrawal is immediate and viral — revocation propagates without pairwise coordination.

  • Policy-filtered:  peers only see material they are authorized to install.

  • Key size is not a constraint:  BGP is built to carry large update payloads, and the scale of the protocol absorbs PQC material sizes without structural change.
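A toy model of the policy-filtered flooding described above. The `DIM` and `Edge` names mirror the text, but the fields and filtering logic are illustrative assumptions, not Graphiant's wire format:

```python
from dataclasses import dataclass, field

@dataclass
class DIM:
    """Dynamic IPsec Material: an Edge's public key material carried in BGP.
    Fields are illustrative only."""
    edge: str
    route_targets: frozenset   # VPN memberships this material belongs to
    kem_public_key: bytes      # hybrid ECC + ML-KEM public key (unsigned)

@dataclass
class Edge:
    name: str
    imported_rts: frozenset    # RT constraint: what this Edge may import
    peers: dict = field(default_factory=dict)

    def receive(self, dim: DIM) -> bool:
        """Install a peer's material only if policy (RT membership) authorizes it."""
        if dim.route_targets & self.imported_rts:
            self.peers[dim.edge] = dim
            return True
        return False

retail = Edge("branch-17", imported_rts=frozenset({"rt:retail"}))
finance = Edge("dc-fin-1", imported_rts=frozenset({"rt:finance"}))
dim = DIM("branch-9", frozenset({"rt:retail"}), kem_public_key=b"\x01" * 1568)

assert retail.receive(dim) is True    # same VPN: key material installed
assert finance.receive(dim) is False  # different VPN: filtered by policy
```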

The Stateless MPLS Core

Graphiant operates a private MPLS backbone where stateless core routers hold no per-flow state: no session tables, no flow cache, no VRF lookups, and no customer cryptographic keys.

Every forwarding decision is derived from the metadata label carried by the packet itself.

The Metadata Label

Each packet carries a two-part label:

SR-MPLS Portion (Core Forwarding Instructions)

  • Egress core label:  Identifies the destination Graphiant PoP or Edge in the core — the “where to exit” instruction

  • SLA class:  Gold, Silver, Bronze, or Default — selects the per-class queue and path treatment along the core

  • Constrained topology (Flex-Algo, RFC 9350):  Binds the packet to a specific computed topology — geo-fenced (avoid CN/RU), low-latency, or compliance-bound path

IPv6 Portion (End-to-End Context)

  • Endpoint ID:  Uniquely identifies the originating customer endpoint.  Decoupled from routing, used only at the Edge

  • VPN membership:  Tenant or VRF context.  Isolation is enforced at the Edge, not by per-VRF state inside the core

  • QoS classification:  Application-aware class marker set by the source Edge, used for remarking or policing at the far Edge

  • Remote Edge link identifier: Identifies the exact egress link on the destination Edge, removing the need for a lookup at the far side
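A sketch of how such a two-part label could be packed and parsed. The field widths and ordering here are assumptions for illustration, not Graphiant's actual encoding:

```python
import struct

# SR-MPLS portion: egress core label, SLA class, Flex-Algo topology.
# IPv6 portion: 16-byte endpoint ID, VPN membership, QoS class, remote link ID.
# Widths are hypothetical; network byte order, 28 bytes total.
_FMT = "!IBB16sIBB"

def pack_label(egress: int, sla: int, flex_algo: int,
               endpoint_id: bytes, vpn: int, qos: int, link: int) -> bytes:
    return struct.pack(_FMT, egress, sla, flex_algo, endpoint_id, vpn, qos, link)

def unpack_label(blob: bytes) -> tuple:
    return struct.unpack(_FMT, blob)

# Example: packet bound for PoP 0x0F42, Gold SLA (1), Flex-Algo 128.
label = pack_label(0x0F42, 1, 128, bytes(16), vpn=42, qos=3, link=2)
assert unpack_label(label) == (0x0F42, 1, 128, bytes(16), 42, 3, 2)
```

Every forwarding decision in the core reads only these fields; no lookup against per-flow state is needed at either the core or the far Edge.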

Security Implications

Because the core is stateless, a compromised router sees only label-switched traffic — it cannot read, replay, or redirect customer data.

There are no session keys to extract, no VRF tables to exfiltrate, and no flow cache to inspect.

This is a fundamental architectural advantage:  the attack surface of the core is reduced to packet forwarding, not traffic inspection.

Scaling Through Statelessness

IPsec encryption creates a combinatorial explosion of security associations (SAs).  A traditional IPsec fabric with N sites needs N(N−1)/2 tunnel pairs, each with its own SA state.

Graphiant’s stateless core reduces this to just two tunnels per Edge (one in each direction to the core), with the SA state held only at the Edges.

The core forwards labeled packets without maintaining any SA state, enabling very large numbers of encrypted flows through a fixed-state-cost core.

High Availability

Because the core holds no control-plane state for any individual flow, a single proxy or core router failure does not require re-convergence of session state.  Traffic is rerouted via standard MPLS fast-reroute.  The Edge devices re-derive the path from the metadata label.  There is no IPsec SA teardown/rebuild cycle, no BGP withdrawal for individual flows, and no session-table synchronization to a standby.

In the case of an IPsec proxy failure, recovery is straightforward through simple rerouting.  All remote sites can be configured with primary and backup proxies.  Since there are no persistent IPsec peer relationships or IKE re-establishment required, traffic can seamlessly fail over.  Keeping all proxy devices connected as primary and backup provides high availability, even at the edge device level.

Flex-Algo as a Plug-and-Play Compliance Platform

Flex-Algo (RFC 9350) allows the Graphiant controller to compute constrained topologies that satisfy regulatory requirements.  A packet bound to Flex-Algo topology 128 (for example, “no transit through China or Russia”) will only traverse core nodes and links that are part of that topology.  The enforcement is in the label:  the constrained topology selection field in the SR-MPLS portion of the metadata label.

This turns a complex geo-fencing or compliance routing problem into a single label-stack field.  New compliance requirements (data sovereignty, low-latency financial corridors, government-only transit) are expressed as new Flex-Algo topologies computed by the controller and distributed via BGP. No per-site configuration is required.
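Conceptually, a Flex-Algo topology is just path computation over a filtered graph: excluded nodes do not exist in that topology. A minimal sketch with a toy topology and hypothetical PoP names:

```python
from collections import deque

def constrained_path(graph, src, dst, excluded_countries, node_country):
    """Shortest path within a Flex-Algo-style sub-topology: nodes in an
    excluded jurisdiction are removed before any path is considered."""
    allowed = {n for n in graph if node_country[n] not in excluded_countries}
    if src not in allowed or dst not in allowed:
        return None
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:                     # reconstruct the path back to src
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt in allowed and nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

graph = {"fra": ["waw", "sin"], "waw": ["fra", "pek"],
         "pek": ["waw", "sin"], "sin": ["fra", "pek"]}
country = {"fra": "DE", "waw": "PL", "pek": "CN", "sin": "SG"}

# Topology 128 ("no transit through CN"): PEK is simply absent from the graph.
assert constrained_path(graph, "fra", "sin", {"CN"}, country) == ["fra", "sin"]
```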

Micro-Segmentation

Graphiant’s micro-segmentation is enforced at the Edge through the VPN membership and endpoint ID fields in the IPv6 metadata.  Each tenant, VRF, or application group is isolated at the network layer.  The controller’s policy evaluation determines which Edges can install which peers’ key material via BGP, making segmentation a control-plane decision rather than a data-plane inspection function.

This means an Edge device in a retail branch cannot reach a financial services VPN even if the same physical core routers carry both tenants’ traffic.  The segmentation is cryptographic (different SA per VPN) and topological (different Flex-Algo topology), not just tag-based.

Data Assurance and Observability

Data assurance in Graphiant provides end-to-end flow-level visibility across the fabric.  Every installed IPsec SA is tied to an observable flow identifier.  The system provides automated detection of:

  • Breach events: unauthorized access attempts or flow anomalies.

  • Unwanted relay: traffic that transits an unexpected path or jurisdiction.

  • SLA violations: latency, jitter, or loss exceeding the contracted Gold/Silver/Bronze class thresholds.

Observability is exposed via OpenTelemetry, allowing integration with existing SIEM and SOC tooling.

For compliance-sensitive industries (banking, healthcare, government), data assurance provides the audit trail that regulators require:  proof that encrypted traffic followed the mandated path, through the mandated topology, with the mandated encryption strength.
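The SLA-violation check above reduces to comparing observed flow metrics against per-class thresholds. A minimal sketch; the threshold values and record fields are assumptions, not Graphiant's actual limits:

```python
SLA_THRESHOLDS_MS = {"gold": 20, "silver": 50, "bronze": 100}  # assumed values

def detect_violations(flow_samples, sla_class):
    """Return the flow samples whose latency exceeds the contracted
    class threshold — the core of an automated SLA-violation detector."""
    limit = SLA_THRESHOLDS_MS[sla_class]
    return [s for s in flow_samples if s["latency_ms"] > limit]

samples = [{"flow": "sa-0x1a2b", "latency_ms": 12},
           {"flow": "sa-0x1a2b", "latency_ms": 31}]
# Gold class (20 ms assumed): only the 31 ms sample is flagged.
assert [v["latency_ms"] for v in detect_violations(samples, "gold")] == [31]
```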

Agentic AI and Distributed Systems Connectivity

The next generation of enterprise AI workloads is a massive distributed systems problem.  AI agents operate across hyperscalers (AWS, Azure, GCP), neo-cloud providers (CoreWeave, Lambda), on-premises data centers, and service provider PoPs.  These agents require real-time, private, peer-to-peer connectivity to share model state, training data, and inference results.

Graphiant’s private fabric creates a single quantum-safe network that connects AI agents within the enterprise and across business partners.  The stateless core provides the throughput required for large model transfers.  The controller-driven IKE model means new agent endpoints can be enrolled and authorized in seconds, not hours.  Flex-Algo topologies can enforce data residency requirements (EU training data stays in EU) without application-layer changes.

B2B partner connections for agent-to-agent communication are secured by the same PQC IPsec fabric, with per-partner ZTNA policy and DLP applied at the Graphiant Edge.  This enables the emerging pattern of cross-enterprise AI collaboration without exposing the underlying data to transit networks.

Comparison: Traditional Model vs. Graphiant

| Dimension | Traditional IPsec / TLS | Graphiant PQC |
| --- | --- | --- |
| PQC handshakes | N(N−1)/2 pairwise | N enrollments to controller |
| Key distribution | Each pair runs IKE independently | BGP distributes via Route Reflectors |
| Cold-start latency | Multi-RTT PQC handshake per peer | SA pre-derived; no first-packet wait |
| Rekey | All N(N−1)/2 handshakes re-run | Controller-orchestrated BGP update |
| Revocation | N pairwise IKE deletes | Single BGP withdrawal propagates |
| Core state | Per-flow SA tables | Stateless label forwarding |
| Crypto agility | Peer-negotiated (downgrade risk) | Controller-dictated (no negotiation) |
| IoT / constrained devices | Must perform full PQC handshake | Edge encrypts on behalf of device |
| TLS proxy | Per-app, per-endpoint state | L3 encryption below all apps |
| Compliance routing | Application-layer enforcement | Flex-Algo label field |
| Observability | Per-device logs, no correlation | End-to-end OTel flow tracing |

Graceful Restart

Graphiant’s stateless core simplifies graceful restart.  

Because core routers hold no per-flow state, a restarting Edge needs to re-derive only its own SA table from the BGP-advertised key material.   The core continues to forward labeled packets during the restart window.  There is no session-table synchronization with the core, no NSF (Non-Stop Forwarding) dependency on the core routers, and no SA blackout period.  The Edge’s BGP session with the Route Reflector carries a graceful restart capability, and the controller re-validates the Edge’s attestation upon reconnection.

Deployment Architecture

Graphiant’s PQC fabric deploys as an overlay on commodity infrastructure:

  • Data center:  A standard x86 server running Graphiant Edge software

    • Connects to the existing LAN/DC fabric

    • All server and mainframe traffic is encrypted at L3 before it leaves the DC

  • Branch:  A small CPE device (or software on existing hardware) running Graphiant Edge

    • All branch traffic — user, IoT, OT — is encrypted at L3

  • Cloud:  Graphiant virtual Edge in AWS, Azure, GCP, or neo-cloud (CoreWeave).

    • Direct PQC-encrypted on-ramp to cloud workloads

  • B2B partner:  SP-hosted or enterprise-hosted DMZ with per-partner ZTNA policy and DLP

    • PQC-secured partner connections.

The fabric operates over any underlay:  commodity broadband, 5G, MPLS, or direct interconnect.  It runs alongside existing SD-WAN, SASE, or multi-cloud architectures.  No rip-and-replace is required.

Stateless IPsec Placement

Because the IPsec processing is decoupled from the stateless core, the encryption function can be placed anywhere:  at the Edge CPE, on a server in the data center, or even directly on the server hosting the application.  This flexibility means that the same architecture that protects a 10,000-site enterprise WAN can also protect a single critical server running a COBOL batch processing job — without any change to the application.

Conclusion

The post-quantum transition creates a fundamental scaling problem for any protocol that relies on pairwise asymmetric handshakes.  Certificate sizes, signature sizes, and key exchange payloads grow by one to two orders of magnitude, while the threat window narrows to 2 to 4 years.  The per-application migration path takes 12 to 15 years.  Traditional IPsec and TLS cannot close that gap.

Graphiant’s architecture addresses this at the root:  controller-driven IKE reduces N-squared handshakes to N enrollments, BGP distributes key material at the scale of the routing plane, a stateless MPLS core eliminates per-flow state from the forwarding path, and Flex-Algo topologies enforce compliance without application-layer changes.

The result is a network that is quantum-safe from day one, scales to thousands of sites with linear cost, and allows enterprise applications to migrate to native PQC TLS on their own schedule — years or decades later — with the network already protected.