Every time you connect to a website over HTTPS, your browser and the server run a key exchange protocol to agree on a shared secret without revealing it to anyone listening on the network. Today, most deployments use X25519 (Diffie-Hellman over Curve25519), which a sufficiently large quantum computer running Shor's algorithm would break.

ML-KEM (FIPS 203, formerly CRYSTALS-Kyber) is its post-quantum replacement. It's a Key Encapsulation Mechanism: one party generates a key pair, the other "encapsulates" a random shared secret inside a ciphertext using the public key, and the first party "decapsulates" to recover the same shared secret. The result: two parties agree on a random secret, and an eavesdropper learns nothing.

Beginner's Intuition 💡

Think of Kyber (ML-KEM) as a lockbox system for the internet, built so that even a quantum computer is not expected to pick the lock.

When your computer wants to securely talk to a website, it asks the website for an open, highly complex mathematical lockbox (the public key). Your computer puts a secret code inside, snaps the box shut, and sends it back.

Because the lockbox uses lattice math (Learning With Errors), any attacker—even one with a powerful quantum computer—just sees mathematical static when they look at the locked box. Only the website, which secretly holds the perfectly structured "key" (the private key), can cut through the static, open the box, and read your secret code to establish a secure connection.

The Mathematical Setting

Kyber operates in the polynomial ring $R_q = \mathbb{Z}_{3329}[x]/(x^{256}+1)$. Elements of this ring are polynomials with 256 coefficients, each an integer from 0 to 3328. Arithmetic is polynomial arithmetic modulo both $3329$ (for coefficients) and $x^{256}+1$ (to keep degree below 256).
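As a rough illustration of arithmetic in $R_q$, the sketch below multiplies two polynomials the slow schoolbook way and folds $x^{256}$ back to $-1$. The function name `poly_mul` is made up for this example and is not part of any library.

```python
Q = 3329   # coefficient modulus
N = 256    # number of coefficients per polynomial

def poly_mul(a, b):
    """Schoolbook product of a and b in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:
                # x^N = -1, so the overflow wraps around with a sign flip
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out

x = [0, 1] + [0] * (N - 2)     # the polynomial x
x255 = [0] * (N - 1) + [1]     # the polynomial x^255
product = poly_mul(x, x255)    # x^256 = -1, i.e. 3328 mod Q
print(product[0], sum(product[1:]))   # prints: 3328 0
```

The double loop performs $256^2$ coefficient multiplications per product, which is the cost the NTT is meant to avoid.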

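Why $q = 3329$? Because $q - 1 = 3328 = 2^8 \cdot 13$, the field $\mathbb{Z}_q$ contains primitive 256th roots of unity (FIPS 203 uses $\zeta = 17$), which is what the Number Theoretic Transform (NTT) needs for $n = 256$. The smaller primes of the form $256k + 1$, namely 257 and 769, would make decryption failures too likely, so 3329 is the smallest workable choice. The NTT turns polynomial multiplication from an $O(n^2)$ operation into an $O(n \log n)$ one, which is critical for performance.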
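The properties of $q$ that the NTT relies on can be checked directly. This is a small sanity check rather than anything from the specification's code; the helper `is_prime` is written just for this example, and $\zeta = 17$ is the root of unity that FIPS 203 fixes.

```python
Q = 3329
ZETA = 17  # root of unity fixed by FIPS 203

# q - 1 = 2^8 * 13, so 256 divides q - 1 but 512 does not
assert Q - 1 == 2**8 * 13
assert (Q - 1) % 256 == 0 and (Q - 1) % 512 != 0

# zeta has order exactly 256: zeta^128 = -1, hence zeta^256 = 1
assert pow(ZETA, 128, Q) == Q - 1
assert pow(ZETA, 256, Q) == 1

def is_prime(n):
    """Trial division, fine for numbers this small."""
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# every prime p <= q with p = 1 (mod 256)
candidates = [p for p in range(257, Q + 1, 256) if is_prime(p)]
print(candidates)   # [257, 769, 3329]
```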

Parameter Sets

Kyber comes in three sizes, trading off security for key/ciphertext size. The module rank $k$ controls how many ring elements are in the key vectors:

ML-KEM-512 (k=2)

~128 bits of security (NIST Level 1). Smallest keys.

Public key: 800 bytes
Secret key: 1,632 bytes
Ciphertext: 768 bytes

Suitable for constrained environments where bandwidth matters most.

ML-KEM-768 (k=3)

~192 bits of security (NIST Level 3). Recommended default.

Public key: 1,184 bytes
Secret key: 2,400 bytes
Ciphertext: 1,088 bytes

Google Chrome offers this, in a hybrid with X25519, on TLS 1.3 connections to servers that support it.

ML-KEM-1024 (k=4)

~256 bits of security (NIST Level 5). Maximum security.

Public key: 1,568 bytes
Secret key: 3,168 bytes
Ciphertext: 1,568 bytes

Used where long-term data secrecy matters (classified, financial).
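The byte counts above follow from a few parameters. The sketch below recomputes them, assuming the FIPS 203 compression widths `du` and `dv` (bits kept per coefficient of the two ciphertext parts) and a secret key that also stores a copy of the public key, its hash, and a 32-byte rejection value. The names `Params` and `sizes` are invented for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Params:
    k: int    # module rank
    du: int   # bits kept per coefficient of u
    dv: int   # bits kept per coefficient of v

PARAMS = {
    "ML-KEM-512":  Params(k=2, du=10, dv=4),
    "ML-KEM-768":  Params(k=3, du=10, dv=4),
    "ML-KEM-1024": Params(k=4, du=11, dv=5),
}

def sizes(p):
    """Return (public key, secret key, ciphertext) sizes in bytes."""
    poly12 = 256 * 12 // 8             # 384 bytes: one polynomial at 12 bits per coefficient
    pk = p.k * poly12 + 32             # t plus the 32-byte seed rho
    sk = p.k * poly12 + pk + 32 + 32   # s, a copy of pk, H(pk), rejection value z
    ct = 32 * (p.du * p.k + p.dv)      # compressed u (k polynomials) and v (one polynomial)
    return pk, sk, ct

for name, p in PARAMS.items():
    print(name, sizes(p))
```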

Step 1: Key Generation

The recipient generates a key pair. The public key is an LWE instance; the secret key is the hidden secret. Here's what happens, explained plainly:

ML-KEM Key Generation

1. Generate a random 32-byte seed $\rho$ and expand it deterministically into a matrix $A \in R_q^{k \times k}$ using SHAKE-128. This matrix is public — anyone can expand it from the seed.

2. Sample a secret vector $s \in R_q^k$ with small coefficients (drawn from the centered binomial distribution — each coefficient is between $-\eta$ and $+\eta$, where $\eta = 2$ or 3).

3. Sample a small error vector $e \in R_q^k$ similarly.

4. Compute $t = As + e$. This is the Module-LWE sample — a noisy linear function of the secret.

Public key: $(\rho, t)$ — 1,184 bytes for ML-KEM-768.
Secret key: $s$ — kept private.
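The four steps can be sketched in a few lines of deliberately slow Python. This is a toy under loud assumptions: it uses Python's non-cryptographic `random` module with a fixed seed instead of SHAKE-based expansion, schoolbook multiplication instead of the NTT, and helper names (`small_poly`, `mat_vec`, `keygen`) invented for the example. It only shows that $t - As$ is the small error $e$.

```python
import random

Q, N, K, ETA = 3329, 256, 3, 2   # ML-KEM-768 sizes
rng = random.Random(2024)        # NOT a CSPRNG: fixed seed so the run is repeatable

def poly_mul(a, b):
    """Schoolbook product in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign = 1 if i + j < N else -1          # x^N wraps around to -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * ai * bj) % Q
    return out

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def small_poly():
    """Coefficients in [-ETA, ETA], stored mod Q."""
    return [(sum(rng.getrandbits(1) for _ in range(ETA))
             - sum(rng.getrandbits(1) for _ in range(ETA))) % Q for _ in range(N)]

def mat_vec(A, v):
    """Matrix-vector product over R_q."""
    rows = []
    for row in A:
        acc = [0] * N
        for a_ij, v_j in zip(row, v):
            acc = poly_add(acc, poly_mul(a_ij, v_j))
        rows.append(acc)
    return rows

def keygen():
    A = [[[rng.randrange(Q) for _ in range(N)] for _ in range(K)] for _ in range(K)]
    s = [small_poly() for _ in range(K)]
    e = [small_poly() for _ in range(K)]
    t = [poly_add(x, y) for x, y in zip(mat_vec(A, s), e)]   # t = A s + e
    return (A, t), s, e   # e is returned only so the example can inspect it

def centered(c):
    """Representative of c mod Q in (-Q/2, Q/2]."""
    return c - Q if c > Q // 2 else c

(A, t), s, e = keygen()
As = mat_vec(A, s)
noise = [[centered((ti - ai) % Q) for ti, ai in zip(t_row, a_row)]
         for t_row, a_row in zip(t, As)]
print(max(abs(c) for row in noise for c in row))   # largest |coefficient| of t - A s
```

Without $s$, the vector $t$ is meant to look uniformly random; with $s$, subtracting $As$ leaves only coefficients of size at most $\eta$.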

Step 2: Encapsulation

The sender (who only has the public key) wants to generate a shared secret and send an encrypted version of it. The process looks like a second LWE sample, but built from the transposed matrix $A^\top$.

ML-KEM Encapsulation

1. Sample a random 32-byte message $m$. Hash it together with the public key hash to derive deterministic randomness.

2. Sample a small secret $r$ and errors $e_1, e_2$.

3. Compute:
  $u = A^\top r + e_1$ (analogous to the public key step, transposed)
  $v = t^\top r + e_2 + \mathrm{encode}(m)$

4. The ciphertext is $(u, v)$, compressed to save bytes.
The shared secret is derived from $m$ and the public key hash.

The sender sends the ciphertext and uses the shared secret to encrypt application data (via AES-GCM or similar).

Step 3: Decapsulation

The recipient uses their secret key to recover the shared secret from the ciphertext. The math works because the LWE noise terms are small enough to cancel out during decryption.

ML-KEM Decapsulation — Why It Works

Here $m'$ is shorthand for $\mathrm{encode}(m)$. The recipient computes $v - s^\top u$:

$v - s^\top u = (t^\top r + e_2 + m') - s^\top(A^\top r + e_1)$

$= (As+e)^\top r + e_2 + m' - s^\top A^\top r - s^\top e_1$

$= e^\top r + e_2 - s^\top e_1 + m'$

The term $e^\top r + e_2 - s^\top e_1$ is small (all the vectors involved have tiny coefficients). Rounding this to the nearest multiple of $q/2$ recovers $m$ exactly, as long as the noise doesn't exceed $q/4$ — which the parameter choices guarantee with overwhelming probability.
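The cancellation can be watched end to end with a toy version of the passively secure scheme. The sketch below is an assumption-laden illustration, not ML-KEM: it skips compression, hashing, and the NTT, draws randomness from Python's non-cryptographic `random` module with a fixed seed, and uses helper names invented here. It encrypts 256 message bits and checks that rounding recovers them.

```python
import random

Q, N, K, ETA = 3329, 256, 3, 2
HALF_Q = (Q + 1) // 2    # 1665, q/2 rounded to the nearest integer
rng = random.Random(7)   # NOT a CSPRNG: fixed seed so the run is repeatable

def poly_mul(a, b):
    """Schoolbook product in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign = 1 if i + j < N else -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * ai * bj) % Q
    return out

def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]

def dot(u, v):
    """Inner product of two vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = add(acc, poly_mul(a, b))
    return acc

def small():
    return [(sum(rng.getrandbits(1) for _ in range(ETA))
             - sum(rng.getrandbits(1) for _ in range(ETA))) % Q for _ in range(N)]

def small_vec():
    return [small() for _ in range(K)]

def keygen():
    A = [[[rng.randrange(Q) for _ in range(N)] for _ in range(K)] for _ in range(K)]
    s, e = small_vec(), small_vec()
    t = [add(dot(A[i], s), e[i]) for i in range(K)]            # t = A s + e
    return (A, t), s

def encrypt(pk, bits):
    A, t = pk
    r, e1, e2 = small_vec(), small_vec(), small()
    # u = A^T r + e1: column j of A dotted with r
    u = [add(dot([A[i][j] for i in range(K)], r), e1[j]) for j in range(K)]
    v = add(add(dot(t, r), e2), [HALF_Q * b for b in bits])    # v = t^T r + e2 + encode(m)
    return u, v

def decrypt(s, ct):
    u, v = ct
    w = sub(v, dot(s, u))   # encode(m) + e^T r + e2 - s^T e1
    # a coefficient closer to q/2 than to 0 (mod q) decodes to 1
    return [1 if Q // 4 < c < 3 * Q // 4 else 0 for c in w]

pk, sk = keygen()
message = [rng.getrandbits(1) for _ in range(N)]
recovered = decrypt(sk, encrypt(pk, message))
print(recovered == message)
```

Each noise coefficient is a sum of about 1,500 products of values in $[-2, 2]$, so it typically stays within a few dozen, far below the $q/4 \approx 832$ threshold.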

Security: The FO Transform

The scheme above (Kyber.PKE) is only secure against passive eavesdroppers. An attacker who can send chosen ciphertexts and observe whether decryption succeeds or fails could potentially extract the secret key. To prevent this, Kyber applies the Fujisaki-Okamoto (FO) transform, a standard compiler that converts a passively secure scheme into one that is secure against active attackers (IND-CCA2 security).

The key idea: during decapsulation, the recipient re-runs encapsulation using the recovered message and checks that the result matches the received ciphertext. If it doesn't match (someone tampered with the ciphertext), a pseudorandom value derived from the secret key and the ciphertext is returned instead. This "implicit rejection" prevents attackers from learning whether decryption succeeded, closing the chosen-ciphertext attack vector.
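The control flow of this check can be sketched as follows. The hash roles (`H`, `G`, `J`) follow the SHA-3 functions that FIPS 203 assigns, but the encryption scheme underneath is a placeholder: `toy_encrypt` and `toy_decrypt` use the same value as public and secret key, so they provide no security and exist only to give the wrapper something deterministic to call.

```python
import hashlib
import hmac
import os

def H(data):
    return hashlib.sha3_256(data).digest()

def G(data):
    d = hashlib.sha3_512(data).digest()
    return d[:32], d[32:]            # (shared secret K, encryption randomness r)

def J(data):
    return hashlib.shake_256(data).digest(32)

# Insecure stand-in for Kyber.PKE: pk == sk, no security whatsoever.
# It is deterministic: the same (pk, m, r) always gives the same ciphertext.
def toy_encrypt(pk, m, r):
    tag = H(b"tag" + r)
    pad = hashlib.shake_256(pk + tag).digest(32)
    return tag + bytes(x ^ y for x, y in zip(m, pad))

def toy_decrypt(sk, c):
    tag, body = c[:32], c[32:]
    pad = hashlib.shake_256(sk + tag).digest(32)
    return bytes(x ^ y for x, y in zip(body, pad))

def keygen():
    sk_pke = os.urandom(32)
    pk = sk_pke                      # stand-in only: a real scheme has pk != sk
    z = os.urandom(32)               # secret value used for implicit rejection
    return pk, (sk_pke, pk, H(pk), z)

def encaps(pk):
    m = os.urandom(32)
    K, r = G(m + H(pk))
    return K, toy_encrypt(pk, m, r)

def decaps(sk, c):
    sk_pke, pk, h_pk, z = sk
    m_prime = toy_decrypt(sk_pke, c)
    K_prime, r_prime = G(m_prime + h_pk)
    c_prime = toy_encrypt(pk, m_prime, r_prime)   # re-run encryption
    K_reject = J(z + c)                           # computed whether or not the check passes
    ok = hmac.compare_digest(c, c_prime)
    # A hardened implementation would pick between the two keys without branching.
    return K_prime if ok else K_reject

pk, sk = keygen()
K, c = encaps(pk)
tampered = bytes([c[0] ^ 1]) + c[1:]
print(decaps(sk, c) == K, decaps(sk, tampered) == K)
```

A tampered ciphertext raises no error and returns a key-shaped value, so the sender of that ciphertext gets no direct signal about whether the check passed.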

Implementation Choices

Centered binomial distribution

Kyber avoids discrete Gaussian sampling, which is slow and notoriously hard to implement in constant time, and instead uses the centered binomial distribution $B_\eta$: sample $2\eta$ random bits, add up the first $\eta$, and subtract the sum of the last $\eta$. This is fast and simple, and its shape is close enough to a Gaussian for the security analysis.
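To make the distribution concrete, the following sketch enumerates every possible bit string and tallies the outputs. The function names are chosen for this example only.

```python
from collections import Counter
from itertools import product

def cbd(bits, eta):
    """Map 2*eta bits to one sample: sum of the first eta bits minus sum of the rest."""
    assert len(bits) == 2 * eta
    return sum(bits[:eta]) - sum(bits[eta:])

def distribution(eta):
    """Exact distribution of B_eta, found by enumerating all 2^(2*eta) bit strings."""
    return Counter(cbd(bits, eta) for bits in product((0, 1), repeat=2 * eta))

print(sorted(distribution(2).items()))
# [(-2, 1), (-1, 4), (0, 6), (1, 4), (2, 1)]   (counts out of 16)
```

The counts are binomial coefficients, which is why the histogram has the bell shape of a narrow Gaussian while needing nothing more than bit counting to sample.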

Constant-time implementation

Side-channel attacks read secret information from timing variations and power consumption. Kyber is designed so that implementations can run in constant time regardless of secret values: NTT operations have data-independent access patterns; the FO re-encryption check uses a constant-time comparison; and a failed check is handled by implicit rejection rather than by a branch on secret data. The property is not automatic, so each implementation still has to be checked for secret-dependent branches and divisions.
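The branch-free selection idea looks roughly like this. Python makes no hard timing guarantees, so treat the sketch as a picture of the masking trick that C or assembly implementations use, not as a hardened routine; `ct_select` and `pick_key` are names invented here.

```python
import hmac

def ct_select(a: bytes, b: bytes, choose_b: int) -> bytes:
    """Return b if choose_b == 1, else a, using a mask instead of an if on the data."""
    assert len(a) == len(b) and choose_b in (0, 1)
    mask = -choose_b & 0xFF                      # 0x00 keeps a, 0xFF switches to b
    return bytes(x ^ (mask & (x ^ y)) for x, y in zip(a, b))

def pick_key(c: bytes, c_prime: bytes, k_accept: bytes, k_reject: bytes) -> bytes:
    """Choose the real key when the ciphertexts match, the rejection key otherwise."""
    same = hmac.compare_digest(c, c_prime)       # timing does not depend on where bytes differ
    return ct_select(k_accept, k_reject, 1 - int(same))

print(pick_key(b"abc", b"abc", b"\x01" * 4, b"\x02" * 4))   # b'\x01\x01\x01\x01'
```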

Performance (AVX2, x86-64)

ML-KEM-768 on modern hardware:
KeyGen: ~14 μs
Encapsulate: ~18 μs
Decapsulate: ~17 μs

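For comparison, X25519 key agreement: ~120 μs. By these figures ML-KEM is roughly 7× faster, despite being post-quantum secure and having larger keys. Timings vary with CPU and implementation, so read them as order-of-magnitude guidance.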

Deployment status

Google Chrome began shipping an X25519+Kyber hybrid key exchange in 2023 and now negotiates the ML-KEM version with servers that support it. Cloudflare, AWS, and Signal have all deployed PQC key exchange. OpenSSL 3.5 and BoringSSL include ML-KEM. The migration from classical to post-quantum key exchange is already underway at internet scale.

Kyber Key Generation — Step by Step

Here is the full key generation procedure at the level of concrete arithmetic, using ML-KEM-768 ($k = 3$) as the running example. Every step maps directly to lines in the FIPS 203 specification.

Step 1: Generate the seed. Sample 32 uniformly random bytes $d$ from the OS CSPRNG. Derive two 32-byte seeds by hashing: $(\rho, \sigma) = G(d \,\|\, k)$, where $G$ is SHA3-512 and $k$ is the module rank appended as a single byte. $\rho$ is the public seed for expanding $A$; $\sigma$ is the private seed for sampling secrets.

Step 2: Expand the matrix. Generate $\hat{A} \in R_q^{k \times k}$ by calling SHAKE-128 with input $\rho \,\|\, j \,\|\, i$ for each entry $\hat{A}_{i,j}$. Each entry is a polynomial in $R_q$ with 256 uniformly random coefficients in $[0, q)$. $\hat{A}$ is generated directly in NTT domain, so no forward NTT is needed. The seed $\rho$ is public, so anyone can re-derive $\hat{A}$ from it.
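Matrix expansion can be sketched with `hashlib` as below. It follows the 12-bit rejection sampling described in FIPS 203, with one workaround that is specific to this sketch: `hashlib` cannot squeeze SHAKE output incrementally, so the function asks for a fixed number of bytes and re-hashes with a longer output if too few candidates survive. The function name and the all-zero demo seed are placeholders.

```python
import hashlib

Q = 3329

def sample_ntt(rho: bytes, i: int, j: int) -> list:
    """Entry (i, j) of the matrix: 256 uniform coefficients in [0, Q) from SHAKE-128(rho || j || i)."""
    seed = rho + bytes([j, i])
    nbytes = 504                           # three SHAKE-128 blocks; doubled below if needed
    while True:
        stream = hashlib.shake_128(seed).digest(nbytes)
        coeffs = []
        for p in range(0, nbytes - 2, 3):  # every 3 bytes hold two 12-bit candidates
            b0, b1, b2 = stream[p], stream[p + 1], stream[p + 2]
            d1 = b0 | ((b1 & 0x0F) << 8)
            d2 = (b1 >> 4) | (b2 << 4)
            for d in (d1, d2):
                # reject values >= Q so the accepted ones stay uniform
                if d < Q and len(coeffs) < 256:
                    coeffs.append(d)
        if len(coeffs) == 256:
            return coeffs
        nbytes *= 2                        # a longer SHAKE output extends the same stream

rho = bytes(32)                            # placeholder seed for the demo
A_hat = [[sample_ntt(rho, i, j) for j in range(3)] for i in range(3)]
print(len(A_hat), len(A_hat[0]), len(A_hat[0][0]))   # 3 3 256
```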

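Step 3: Sample the secret and error vectors. Using $\sigma$ as a PRF key and a counter, sample:

$s_i = \mathrm{CBD}_\eta(\mathrm{PRF}(\sigma, i)), \quad e_i = \mathrm{CBD}_\eta(\mathrm{PRF}(\sigma, k + i)), \quad i = 0, \dots, k - 1$

Each polynomial's coefficients are drawn from the centered binomial distribution $B_\eta$ with $\eta = 2$. To sample one coefficient: draw 4 bits $b_0, b_1, b_2, b_3$ and return $(b_0 + b_1) - (b_2 + b_3)$. This avoids the need for Gaussian sampling (which is slow and hard to implement in constant time).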
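A byte-level version of this sampling step might look like the following. It assumes the PRF is SHAKE-256 over the seed followed by a one-byte counter, with $64\eta$ output bytes per polynomial, as in FIPS 203; the function names and the all-zero demo seed are placeholders.

```python
import hashlib

Q, N = 3329, 256

def prf(sigma: bytes, counter: int, eta: int) -> bytes:
    """SHAKE-256(sigma || counter), 64*eta bytes: exactly the bits one polynomial needs."""
    return hashlib.shake_256(sigma + bytes([counter])).digest(64 * eta)

def sample_poly_cbd(data: bytes, eta: int) -> list:
    """Turn 64*eta bytes into 256 coefficients from B_eta, stored mod Q."""
    bits = [(byte >> k) & 1 for byte in data for k in range(8)]   # little-endian bit order
    coeffs = []
    for i in range(N):
        chunk = bits[2 * eta * i: 2 * eta * (i + 1)]
        x, y = sum(chunk[:eta]), sum(chunk[eta:])
        coeffs.append((x - y) % Q)
    return coeffs

def sample_secret_and_error(sigma: bytes, k: int = 3, eta: int = 2):
    """Counters 0..k-1 feed s, counters k..2k-1 feed e."""
    s = [sample_poly_cbd(prf(sigma, n, eta), eta) for n in range(k)]
    e = [sample_poly_cbd(prf(sigma, k + n, eta), eta) for n in range(k)]
    return s, e

s, e = sample_secret_and_error(bytes(32))   # placeholder seed for the demo
print(sorted(set(c for poly in s for c in poly)))
```

Negative values appear as $q - 1$ and $q - 2$ because coefficients are stored modulo $q$.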

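Step 4: Compute the public key polynomial. Apply the NTT to $s$ and $e$, then compute:

$\hat{t} = \hat{A} \circ \hat{s} + \hat{e}$

where $\hat{s}, \hat{e}$ are the NTT representations of $s, e$. The product $\hat{A} \circ \hat{s}$ is a matrix-vector multiplication carried out in NTT domain: $k^2 = 9$ polynomial products for ML-KEM-768. Because $\mathbb{Z}_q$ has 256th but not 512th roots of unity, each product is not fully coefficient-wise; it consists of 128 independent multiplications of degree-one polynomials.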
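The sketch below implements the forward transform, its inverse, and the NTT-domain product, following the loop structure of the FIPS 203 algorithms, and compares the result with schoolbook multiplication. It is written for readability (modular exponentiation inside the loops instead of precomputed tables) and is not constant time.

```python
import random

Q, N, ZETA = 3329, 256, 17

def bitrev7(i):
    """Reverse the 7-bit binary representation of i."""
    return int(f"{i:07b}"[::-1], 2)

def ntt(f):
    f = list(f)
    i, length = 1, 128
    while length >= 2:
        for start in range(0, N, 2 * length):
            zeta = pow(ZETA, bitrev7(i), Q)
            i += 1
            for j in range(start, start + length):
                t = zeta * f[j + length] % Q
                f[j + length] = (f[j] - t) % Q
                f[j] = (f[j] + t) % Q
        length //= 2
    return f

def intt(f):
    f = list(f)
    i, length = 127, 2
    while length <= 128:
        for start in range(0, N, 2 * length):
            zeta = pow(ZETA, bitrev7(i), Q)
            i -= 1
            for j in range(start, start + length):
                t = f[j]
                f[j] = (t + f[j + length]) % Q
                f[j + length] = zeta * (f[j + length] - t) % Q
        length *= 2
    return [x * 3303 % Q for x in f]       # 3303 is the inverse of 128 mod Q

def mul_ntt(f, g):
    """128 products of degree-one polynomials, each modulo x^2 - gamma."""
    h = [0] * N
    for i in range(N // 2):
        gamma = pow(ZETA, 2 * bitrev7(i) + 1, Q)
        a0, a1, b0, b1 = f[2 * i], f[2 * i + 1], g[2 * i], g[2 * i + 1]
        h[2 * i] = (a0 * b0 + a1 * b1 * gamma) % Q
        h[2 * i + 1] = (a0 * b1 + a1 * b0) % Q
    return h

def schoolbook(a, b):
    """Reference product in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign = 1 if i + j < N else -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * ai * bj) % Q
    return out

rng = random.Random(1)                     # test data only, not key material
a = [rng.randrange(Q) for _ in range(N)]
b = [rng.randrange(Q) for _ in range(N)]
fast = intt(mul_ntt(ntt(a), ntt(b)))
print(fast == schoolbook(a, b))
```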

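Result. The public key is $(\rho, \hat{t})$, where $\hat{t}$ is stored uncompressed at 12 bits per coefficient: $3 \cdot 256 \cdot 12 / 8 = 1{,}152$ bytes, plus the 32-byte seed $\rho$, giving 1,184 bytes total. The secret key retains $\hat{s}$ in NTT form plus a copy of the public key, the hash $H(pk)$, and a random 32-byte value $z$ used for implicit rejection in the FO transform (2,400 bytes total).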

Encapsulation and Decapsulation — Detailed Steps

Encapsulation and decapsulation are mirror operations. The sender uses the public key to encrypt a random 32-byte value; the receiver uses the secret key to recover it. Here is the complete arithmetic.

Encapsulation

1. Sample message. Draw 32 random bytes $m$. Hash with the public key hash: $(K, r) = G(m \,\|\, H(pk))$. The randomness $r$ seeds all sampling; $K$ is the shared secret.

2. Sample ephemeral secret and errors. Using $r$ as PRF seed, sample $r' \in R_q^k$ and $e_1 \in R_q^k$ and $e_2 \in R_q$ from $B_\eta$.

3. Compute ciphertext components.

$u = \mathrm{NTT}^{-1}(\hat{A}^\top \circ \hat{r}') + e_1 \pmod{q}$

$v = \mathrm{NTT}^{-1}(\hat{t}^\top \circ \hat{r}') + e_2 + \left\lfloor \frac{q}{2} \right\rceil \cdot m \pmod{q}$

The term $\lfloor q/2 \rceil \cdot m$ encodes each bit of $m$ as either 0 or $\lfloor q/2 \rceil = 1665$ (rounded), so a 1 bit sits in the middle of the coefficient range, far from both 0 and $q$, giving maximum noise tolerance.

4. Compress. $u$ is compressed from 12-bit to 10-bit coefficients; $v$ from 12-bit to 4-bit. This compression is the main source of the (negligible) decryption failure probability. Total ciphertext: 1,088 bytes for ML-KEM-768.

5. Output. Ciphertext $c = (u, v)$ in compressed form and the shared secret $K$ from step 1. (Round-3 Kyber additionally hashed the ciphertext into the final key via SHAKE-256; FIPS 203 removed that step.)
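The compression in step 4 and the bit encoding in step 3 are both instances of the same rounding map. The sketch below writes it with integer arithmetic and measures the worst-case round-trip error for the widths used by ML-KEM-768; the helper names are chosen for this example.

```python
Q = 3329

def compress(x: int, d: int) -> int:
    """Round (2^d / Q) * x to the nearest integer, then reduce mod 2^d."""
    return (((x << d) + Q // 2) // Q) % (1 << d)

def decompress(y: int, d: int) -> int:
    """Round (Q / 2^d) * y to the nearest integer."""
    return (y * Q + (1 << (d - 1))) >> d

def centered_error(x: int, d: int) -> int:
    """Distance mod Q between x and its compress/decompress round trip."""
    diff = (decompress(compress(x, d), d) - x) % Q
    return min(diff, Q - diff)

# A message bit is a 1-bit "compressed" coefficient: 1 decodes to 1665
print(decompress(1, 1), compress(1665, 1))   # 1665 1

for d in (10, 4, 1):
    worst = max(centered_error(x, d) for x in range(Q))
    print(d, worst)
```

The error grows as fewer bits are kept, which is why $v$ (4 bits) contributes more rounding noise than $u$ (10 bits).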

Decapsulation

1. Decompress and recover $m'$. Decompress $u$ and $v$. Compute:

$w = v - \mathrm{NTT}^{-1}(\hat{s}^\top \circ \mathrm{NTT}(u)) \pmod{q}$

Substituting: this equals $\lfloor q/2 \rceil \cdot m + e^\top r' + e_2 - s^\top e_1$ plus compression rounding errors. The noise terms are all small. Round each coefficient to 0 or $\lfloor q/2 \rceil$ to recover the message $m'$ bit by bit.

2. Re-encapsulate. Run the full encapsulation algorithm using the recovered $m'$ and the stored public key. Recompute $(K', r) = G(m' \,\|\, H(pk))$ and, from it, the ciphertext $c'$.

3. Check ciphertext equality (implicit rejection). Compare the recomputed ciphertext $c'$ to the received one in constant time. If they match, output $K'$ (the shared secret). If they differ (indicating a tampered or malformed ciphertext), output a pseudorandom value $J(z \,\|\, c)$ derived from the secret value $z$ stored in the secret key and the received ciphertext $c$. An attacker cannot distinguish these cases without knowing the secret key; this is how the FO transform achieves CCA2 security.

Kyber in TLS — Hybrid Key Exchange

Kyber is not deployed alone in TLS. The standard approach is a hybrid key exchange: run both X25519 (classical) and ML-KEM-768 (post-quantum) in the same handshake, then combine their shared secrets cryptographically. This gives security against both a classical attacker today and a quantum attacker in the future.

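The hybrid group used for this is X25519MLKEM768, specified in IETF Internet-Drafts that build on draft-ietf-tls-hybrid-design. The client's TLS 1.3 key share carries both an X25519 public key (32 bytes) and an ML-KEM-768 public key (1,184 bytes). The server responds with its own X25519 public key and an ML-KEM-768 ciphertext (1,088 bytes). Each side independently derives two shared secrets and combines them:

$\text{shared\_secret} = K_{\text{ML-KEM}} \,\|\, K_{\text{X25519}}$

The concatenated value then enters the TLS 1.3 key schedule, where HKDF mixes it into the session keys.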

The concatenation-then-HKDF approach ensures that if either component is secure, the combined secret is secure. An attacker who breaks X25519 (e.g., with a quantum computer) still cannot break ML-KEM-768; an attacker who breaks ML-KEM-768 (e.g., due to an unforeseen classical attack) still cannot break X25519.
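A minimal sketch of the combiner, assuming SHA-256 and an all-zero salt, is shown below. Real TLS 1.3 feeds the concatenation into a longer key schedule with its own salts and labels, so this only illustrates why both inputs influence the output; the two secrets are random placeholders and `combine` is a name invented here.

```python
import hashlib
import hmac
import os

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract from RFC 5869, instantiated with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def combine(mlkem_ss: bytes, x25519_ss: bytes, salt: bytes = bytes(32)) -> bytes:
    """Concatenate the two 32-byte secrets, then extract one pseudorandom key."""
    assert len(mlkem_ss) == 32 and len(x25519_ss) == 32
    return hkdf_extract(salt, mlkem_ss + x25519_ss)

mlkem_ss = os.urandom(32)    # placeholder for the ML-KEM-768 shared secret
x25519_ss = os.urandom(32)   # placeholder for the X25519 shared secret
session_secret = combine(mlkem_ss, x25519_ss)
print(len(session_secret))   # 32
```

Changing either input changes the output, so an attacker needs both component secrets to reproduce the combined one.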

In Chrome's implementation (first shipped as X25519Kyber768 in Chrome 116 and since enabled by default), the combined key share adds roughly 1.2 KB to the TLS ClientHello, the size of the ML-KEM-768 public key. Reported benchmarks put the added latency below 1 ms on a typical connection, because handshake time is dominated by network round trips rather than by the cryptography.

Cloudflare's servers support X25519MLKEM768 across all their infrastructure. AWS Certificate Manager and AWS KMS both expose ML-KEM options in their APIs. OpenSSL 3.5 includes ML-KEM-512, ML-KEM-768, and ML-KEM-1024 as named groups in the default build.

Code Example — Using ML-KEM in Practice

The following examples show the API surface of ML-KEM-768 as exposed by two widely used open-source libraries: liboqs (C, from the Open Quantum Safe project) and pqcrypto (Rust). Check each project's own guidance on production readiness before deploying.

liboqs (C)

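#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <oqs/oqs.h>

int main(void) {
    // Returns NULL if ML-KEM-768 is not enabled in this liboqs build
    OQS_KEM *kem = OQS_KEM_new(OQS_KEM_alg_ml_kem_768);
    if (kem == NULL) return EXIT_FAILURE;

    uint8_t *pk        = malloc(kem->length_public_key);    // 1184 bytes
    uint8_t *sk        = malloc(kem->length_secret_key);    // 2400 bytes
    uint8_t *ct        = malloc(kem->length_ciphertext);    // 1088 bytes
    uint8_t *ss_sender = malloc(kem->length_shared_secret); // 32 bytes
    uint8_t *ss_recv   = malloc(kem->length_shared_secret); // 32 bytes
    if (!pk || !sk || !ct || !ss_sender || !ss_recv) return EXIT_FAILURE;

    // Key generation (receiver side)
    if (OQS_KEM_keypair(kem, pk, sk) != OQS_SUCCESS) return EXIT_FAILURE;
    // Encapsulation (sender side)
    if (OQS_KEM_encaps(kem, ct, ss_sender, pk) != OQS_SUCCESS) return EXIT_FAILURE;
    // Decapsulation (receiver side)
    if (OQS_KEM_decaps(kem, ss_recv, ct, sk) != OQS_SUCCESS) return EXIT_FAILURE;

    // ss_sender == ss_recv: use as AES-256-GCM key
    assert(memcmp(ss_sender, ss_recv, kem->length_shared_secret) == 0);

    // Wipe secret material before releasing it
    OQS_MEM_cleanse(sk, kem->length_secret_key);
    OQS_MEM_cleanse(ss_sender, kem->length_shared_secret);
    OQS_MEM_cleanse(ss_recv, kem->length_shared_secret);
    free(pk); free(sk); free(ct); free(ss_sender); free(ss_recv);
    OQS_KEM_free(kem);
    return EXIT_SUCCESS;
}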

pqcrypto (Rust)

use pqcrypto_mlkem::mlkem768;
use pqcrypto_traits::kem::{Ciphertext, PublicKey, SecretKey, SharedSecret};

fn main() {
    // Key generation (receiver)
    let (pk, sk) = mlkem768::keypair();
    assert_eq!(pk.as_bytes().len(), 1184);
    assert_eq!(sk.as_bytes().len(), 2400);

    // Encapsulation (sender): returns the shared secret first, then the ciphertext
    let (ss_sender, ct) = mlkem768::encapsulate(&pk);
    assert_eq!(ct.as_bytes().len(), 1088);
    assert_eq!(ss_sender.as_bytes().len(), 32);

    // Decapsulation (receiver)
    let ss_recv = mlkem768::decapsulate(&ct, &sk);

    // Both sides now hold the same 32 bytes
    assert_eq!(ss_sender.as_bytes(), ss_recv.as_bytes());
    // Now use ss_sender as a key for AES-256-GCM or ChaCha20-Poly1305
}

For hybrid TLS with Go (using the standard library's built-in post-quantum key exchange):

// Go 1.24+ supports X25519MLKEM768 in crypto/tls and offers it by default
// whenever Config.CurvePreferences is left unset.
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	cfg := &tls.Config{MinVersion: tls.VersionTLS13} // no PQ-specific settings needed
	conn, err := tls.Dial("tcp", "example.com:443", cfg) // placeholder host
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// ConnectionState.CurveID (added in Go 1.25) reports the negotiated key exchange.
	fmt.Println("hybrid PQ:", conn.ConnectionState().CurveID == tls.X25519MLKEM768)
}
// Chrome DevTools > Security panel shows "X25519MLKEM768" when negotiated.

Key library references for production use: