4.2 BKZ Algorithm
Block Korkine–Zolotarev. The practical lattice reducer that sets the cryptographic security baseline.
BKZ (Block Korkine–Zolotarev), introduced by Schnorr and Euchner in 1994, is the dominant lattice basis reduction algorithm in both theory and practice. Where LLL finds a $2^{O(n)}$-approximate shortest vector in polynomial time, BKZ trades exponential time for exponentially better output quality, parameterized by a block size $\beta$.
BKZ is the algorithm that cryptographers use to estimate the hardness of Kyber, Dilithium, NTRU, and all other lattice-based cryptosystems. Setting secure parameters means choosing dimensions where BKZ cannot terminate before the heat death of the universe.
Beginner's Intuition 💡
If LLL is a basic GPS that only looks at two streets at a time to find shortcuts, BKZ is a much more powerful navigation system.
Instead of just looking at two streets, BKZ looks at whole neighborhoods (called "blocks") at once, finding the absolute best, mathematically perfect path through that entire neighborhood before moving on. By stitching these perfect neighborhood paths together, it creates an incredibly efficient overall route.
Cryptographers constantly test their defenses against BKZ, ensuring their mathematical "mazes" are so massive that even peering at 100 streets at once isn't enough to find a way through.
The Core Idea
BKZ-$\beta$ applies LLL to improve the full basis, then calls an exact SVP solver on every consecutive $\beta$-dimensional projected sublattice. The SVP solution is inserted back into the basis, and the process repeats until convergence.
Think of it as LLL with a much stronger local improvement step. LLL only swaps adjacent pairs; BKZ restructures full $\beta$-dimensional blocks.
Projected Sublattices
For index $i$, define the local projected sublattice $L_{[i,\,i+\beta)}$ as the lattice spanned by the projections of $b_i, \ldots, b_{i+\beta-1}$ onto the orthogonal complement of $\mathrm{span}(b_1, \ldots, b_{i-1})$:

$$\pi_i(v) = v - \sum_{j=1}^{i-1} \frac{\langle v, b_j^* \rangle}{\langle b_j^*, b_j^* \rangle}\, b_j^*, \qquad L_{[i,\,i+\beta)} = L\big(\pi_i(b_i), \ldots, \pi_i(b_{i+\beta-1})\big).$$

BKZ finds the shortest vector in $L_{[i,\,i+\beta)}$ and uses it to update the basis. After an SVP call on each block, BKZ performs another LLL pass. This is one BKZ tour. The algorithm runs many tours until the basis stabilizes.
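The projection $\pi_i$ can be computed directly with NumPy. The sketch below (the helper names `gram_schmidt_vectors` and `project` are ours, chosen for illustration) subtracts the components along $b_1^*, \ldots, b_{i-1}^*$:

```python
import numpy as np

def gram_schmidt_vectors(B):
    """Plain Gram-Schmidt: returns the orthogonal vectors b*_j."""
    B_star = np.array(B, dtype=float)
    for i in range(len(B)):
        for j in range(i):
            B_star[i] -= ((B_star[i] @ B_star[j])
                          / (B_star[j] @ B_star[j])) * B_star[j]
    return B_star

def project(v, B_star, i):
    """pi_i(v): project v onto the orthogonal complement of
    span(b_1, ..., b_{i-1}) by subtracting the b*_j components."""
    v = np.array(v, dtype=float)
    for j in range(i):
        v -= ((v @ B_star[j]) / (B_star[j] @ B_star[j])) * B_star[j]
    return v

B = np.array([[2, 0, 0], [1, 3, 0], [1, 1, 4]], dtype=float)
B_star = gram_schmidt_vectors(B)
# pi_1 is the identity; pi_2 removes the b_1 component
print(project(B[1], B_star, 0))   # unchanged: [1. 3. 0.]
print(project(B[1], B_star, 1))   # b*_2:      [0. 3. 0.]
```

Note that $\pi_1$ is the identity, so the first block of BKZ works on the unprojected basis vectors themselves.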
Output Quality
The first basis vector after BKZ-$\beta$ satisfies:

$$\|b_1\| \le \gamma_\beta^{\frac{n-1}{\beta-1}} \cdot \lambda_1(L),$$

where $\gamma_\beta$ is the Hermite constant of dimension $\beta$. The root-Hermite factor of BKZ-$\beta$ is approximately:

$$\delta_\beta \approx \left( \frac{\beta}{2\pi e} \,(\pi\beta)^{1/\beta} \right)^{\frac{1}{2(\beta-1)}}.$$

For $\beta = 2$, BKZ reduces to LLL. As $\beta \to n$, BKZ approaches a Hermite–Korkine–Zolotarev (HKZ) reduced basis, whose first vector is an exact shortest vector.
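The root-Hermite-factor estimate above is easy to evaluate. A minimal sketch (the function name `root_hermite_factor` is ours; the formula is the standard asymptotic estimate, which is only meaningful for $\beta \gtrsim 50$):

```python
import math

def root_hermite_factor(beta):
    """Asymptotic root-Hermite-factor estimate for BKZ-beta.
    delta ~ (beta/(2*pi*e) * (pi*beta)^(1/beta))^(1/(2(beta-1))),
    valid for beta >~ 50."""
    return (beta / (2 * math.pi * math.e)
            * (math.pi * beta) ** (1.0 / beta)) ** (1.0 / (2 * (beta - 1)))

for beta in [50, 100, 200, 400]:
    print(f"beta={beta:4d}  delta ~ {root_hermite_factor(beta):.5f}")
```

The estimate decreases toward 1 as $\beta$ grows: around $\beta = 400$ it gives $\delta_\beta \approx 1.004$, which is the regime relevant for attacks on deployed schemes.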
Complexity
Each BKZ tour makes $O(n)$ calls to an SVP oracle in dimension $\beta$. The cost of exact SVP in dimension $\beta$ is roughly:

$$2^{0.292\beta + o(\beta)}$$

using the best known sieving algorithms. The number of BKZ tours needed for convergence is small in practice (though difficult to bound theoretically). Total complexity is therefore roughly:

$$\mathrm{poly}(n) \cdot 2^{0.292\beta + o(\beta)}.$$
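This cost model is the "core-SVP" convention used throughout lattice security estimation: count only the exponent of one SVP call and ignore polynomial factors. A minimal sketch (the constants are the published sieving exponents: 0.292 classical from the BDGL sieve, 0.265 quantum from Laarhoven's analysis; the helper name `core_svp_bits` is ours):

```python
# Core-SVP cost model: one SVP call in dimension beta costs about
# 2^(0.292*beta) classically and 2^(0.265*beta) quantumly; the
# poly(n) factor from tours and oracle calls is conventionally dropped.
CLASSICAL_EXP = 0.292
QUANTUM_EXP = 0.265

def core_svp_bits(beta, exponent=CLASSICAL_EXP):
    """log2 of the estimated SVP-oracle cost at block size beta."""
    return exponent * beta

for beta in [100, 200, 400, 600]:
    print(f"beta={beta}: ~2^{core_svp_bits(beta):.0f} classical, "
          f"~2^{core_svp_bits(beta, QUANTUM_EXP):.0f} quantum")
```

Because the exponent is linear in $\beta$, every extra unit of block size an attacker needs adds a fixed number of bits of security.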
BKZ 2.0
BKZ 2.0 (Chen and Nguyen, 2011) substantially improved the practical performance of BKZ with three innovations:
Preprocessing: each local SVP call is preprocessed by running BKZ with smaller block size on the local block first. This dramatically reduces the cost of the SVP oracle.
Pruned enumeration: instead of exact SVP via full enumeration, BKZ 2.0 uses pruned enumeration with an optimized bounding function. This allows much larger block sizes in practice; block sizes in the 60–80 range become practical on a laptop.
Early termination: the algorithm tracks the Gram–Schmidt profile across tours and aborts the reduction once the profile stops improving, avoiding wasted computation on tours that no longer change the basis quality.
The GSA and Security Estimation
The Geometric Series Assumption (GSA) predicts that after BKZ-$\beta$, the Gram–Schmidt norms decay geometrically:

$$\|b_i^*\| \approx \alpha^{\,i-1}\,\|b_1\|, \qquad \alpha \approx \delta_\beta^{-2}.$$

This model underlies nearly all lattice security estimates. Tools like the Lattice Estimator use the GSA to predict at what block size BKZ would recover a decryption key or a signing key, then equate that to a bit security level via the SVP cost formula.
For Kyber-512, the BKZ block size needed for a key recovery attack is approximately $\beta \approx 400$, corresponding to roughly $2^{118}$ operations in the core-SVP model, the source of its claimed 128-bit post-quantum security.
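Inverting the root-Hermite-factor estimate gives a crude block-size predictor of the kind security estimators automate. This is a toy sketch (the names `block_size_for_delta` and `root_hermite_factor` are ours; the real Lattice Estimator models the full attack, not just the Hermite factor):

```python
import math

def root_hermite_factor(beta):
    """Standard asymptotic root-Hermite-factor estimate for BKZ-beta."""
    return (beta / (2 * math.pi * math.e)
            * (math.pi * beta) ** (1.0 / beta)) ** (1.0 / (2 * (beta - 1)))

def block_size_for_delta(target_delta):
    """Smallest beta whose predicted root-Hermite factor reaches the
    target (linear scan; the estimate is monotone in this range)."""
    beta = 50
    while root_hermite_factor(beta) > target_delta:
        beta += 1
    return beta

# An attack that requires delta ~ 1.004 needs roughly:
beta = block_size_for_delta(1.004)
print(f"beta ~ {beta}, core-SVP cost ~ 2^{0.292 * beta:.0f}")
```

This reproduces the right order of magnitude: a target of $\delta \approx 1.004$ lands near $\beta \approx 400$, matching the Kyber-512 figure above.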
The BKZ Shape and Gram–Schmidt Profile
After running BKZ-$\beta$ for several tours, the Gram–Schmidt norms $\|b_i^*\|$ plotted against the index $i$ form a characteristic BKZ shape. Understanding this shape is essential for predicting attack costs and for using the BKZ simulator.
In the Geometric Series Assumption (GSA) regime, the log-profile is a straight line with slope $-2\log\delta_\beta$:

$$\log\|b_i^*\| \approx \log\|b_1\| - 2(i-1)\log\delta_\beta.$$

In reality the profile deviates from the GSA: the first few and last few Gram–Schmidt vectors are shorter than predicted (the "head" and "tail" effects). The head effect arises because the first SVP call is unrestricted by any projection; the tail effect arises because the final blocks have dimension smaller than $\beta$ and end up essentially HKZ-reduced. The BKZ simulator (Chen–Nguyen, 2011) models these deviations using a probabilistic model of SVP solutions in each block.
Simulating BKZ
The BKZ simulator predicts the Gram–Schmidt profile after a given number of BKZ tours without actually running BKZ. It is the engine behind the lattice-estimator (Albrecht et al.) used to generate NIST security estimates.
The simulator works as follows. Let $\ell_i = \log\|b_i^*\|$. After processing the block starting at index $i$ in one tour, the simulator updates:

$$\ell_i \leftarrow \log \mathrm{GH}(\beta) + \frac{1}{\beta}\sum_{j=i}^{i+\beta-1} \ell_j,$$

where $\mathrm{GH}(d)$ is the Gaussian heuristic radius for dimension $d$. The update reflects that the SVP oracle finds a vector of length equal to the Gaussian heuristic applied to the projected sublattice. The simulator then sweeps through all blocks for each tour, producing the predicted profile.
In practice the simulator matches experimentally observed BKZ profiles to within 1–2% for block sizes above roughly 50. It allows security estimators to predict the exact block size at which BKZ would distinguish an LWE sample from uniform (the decryption-failure distinguisher threshold) without months of computation.
Python Implementation from Scratch
The following is a complete Python implementation of BKZ reduction. It builds on the LLL algorithm by adding an exact SVP oracle call on each projected sublattice block. The SVP oracle uses enumeration — exhaustive search over all short vectors in the projected block.
import numpy as np
def gram_schmidt(B):
"""Compute Gram-Schmidt orthogonalization and coefficients."""
n = B.shape[0]
B_star = np.zeros_like(B, dtype=float)
mu = np.zeros((n, n), dtype=float)
for i in range(n):
B_star[i] = B[i].copy()
for j in range(i):
mu[i, j] = np.dot(B[i], B_star[j]) / np.dot(B_star[j], B_star[j])
B_star[i] -= mu[i, j] * B_star[j]
return B_star, mu
def lll_reduction(B, delta=0.99):
"""LLL reduction (used as a subroutine inside BKZ)."""
B = np.array(B, dtype=float)
n = B.shape[0]
B_star, mu = gram_schmidt(B)
k = 1
while k < n:
for j in range(k - 1, -1, -1):
if abs(mu[k, j]) > 0.5:
B[k] -= round(mu[k, j]) * B[j]
B_star, mu = gram_schmidt(B)
norm_k = np.dot(B_star[k], B_star[k])
norm_k1 = np.dot(B_star[k - 1], B_star[k - 1])
if norm_k + mu[k, k - 1] ** 2 * norm_k1 >= delta * norm_k1:
k += 1
else:
B[[k, k - 1]] = B[[k - 1, k]]
B_star, mu = gram_schmidt(B)
k = max(k - 1, 1)
return B
def project_sublattice(B_star, mu, i, beta, n):
"""
Compute the projected sublattice basis for block [i, i+beta).
The projection pi_i maps vectors onto the orthogonal complement
of span(b_1, ..., b_{i-1}). In the Gram-Schmidt basis, this is
simply extracting the relevant coefficients.
    Returns:
        local: block_size x block_size lower-triangular matrix
            representing the projected block (the block may be
            smaller than beta at the tail of the basis)
        block_size: actual dimension of the block
    """
block_end = min(i + beta, n)
block_size = block_end - i
# Build the local basis in the projected Gram-Schmidt coordinates
local = np.zeros((block_size, block_size), dtype=float)
for k in range(block_size):
# Diagonal: the Gram-Schmidt norm ||b*_{i+k}||
local[k, k] = np.linalg.norm(B_star[i + k])
# Off-diagonal: mu coefficients within the block
for j in range(k):
local[k, j] = mu[i + k, i + j] * np.linalg.norm(B_star[i + j])
return local, block_size
def enumerate_svp(local_basis, block_size):
"""
Solve SVP exactly on a small projected block via enumeration.
Uses depth-first search over integer coefficient vectors,
pruning branches where the partial norm exceeds the current best.
This is practical for block_size up to ~25-30.
Returns:
coeffs: integer coefficient vector of the shortest vector
"""
    # Gram-Schmidt data of the local block, computed once up front
    # (recomputing it at every enumeration node is needlessly slow)
    B_star_local, mu_local = gram_schmidt(local_basis)
    norms_sq = np.array([np.dot(B_star_local[j], B_star_local[j])
                         for j in range(block_size)])
    # Start with the first basis vector as the current shortest
    best_norm_sq = np.dot(local_basis[0], local_basis[0])
    best_coeffs = np.zeros(block_size, dtype=int)
    best_coeffs[0] = 1
    def _enumerate(level, partial_norm_sq, coeffs):
        nonlocal best_norm_sq, best_coeffs
        if level < 0:
            if partial_norm_sq > 0 and partial_norm_sq < best_norm_sq:
                best_norm_sq = partial_norm_sq
                best_coeffs = coeffs.copy()
            return
        # Compute the enumeration center for this level
        center = 0.0
        for j in range(level + 1, block_size):
            center -= coeffs[j] * mu_local[j, level]
# Search radius: how far from center can we go?
remaining = best_norm_sq - partial_norm_sq
if remaining <= 0:
return
bound = int(np.sqrt(remaining / norms_sq[level]) + abs(center)) + 1
for x in range(-bound, bound + 1):
coeffs[level] = x
new_partial = partial_norm_sq + (x + center) ** 2 * norms_sq[level]
if new_partial < best_norm_sq:
_enumerate(level - 1, new_partial, coeffs)
coeffs = np.zeros(block_size, dtype=int)
_enumerate(block_size - 1, 0.0, coeffs)
return best_coeffs
def bkz_reduction(B, beta=10, max_tours=10):
"""
BKZ lattice basis reduction.
Args:
B: integer matrix where rows are basis vectors (n x m)
beta: block size. Controls quality vs. cost tradeoff.
beta=2 is equivalent to LLL.
Larger beta → better reduction but exponential cost.
max_tours: maximum number of BKZ tours (full sweeps).
Returns:
B: BKZ-reduced basis
"""
B = np.array(B, dtype=float)
n = B.shape[0]
# Start with an LLL-reduced basis
B = lll_reduction(B)
for tour in range(max_tours):
improved = False
# One BKZ tour: sweep through all blocks [i, i+beta)
for i in range(n - 1):
B_star, mu = gram_schmidt(B)
block_end = min(i + beta, n)
block_size = block_end - i
if block_size < 2:
continue
# === Step 1: Project onto the local sublattice ===
local_basis, bs = project_sublattice(B_star, mu, i, beta, n)
# === Step 2: Solve SVP on the projected block ===
coeffs = enumerate_svp(local_basis, bs)
            # === Step 3: Insert the short vector back ===
            # Lift the SVP solution from projected coordinates to the
            # full basis: the same integer coefficients apply.
            new_vec = np.zeros_like(B[0], dtype=float)
            for k in range(bs):
                new_vec += coeffs[k] * B[i + k]
            # Projected length of the candidate, measured in the
            # local block coordinates (rows of local_basis).
            proj_norm = np.linalg.norm(coeffs @ local_basis)
            # Replacing the basis vector at the last nonzero
            # coefficient preserves the lattice only when that
            # coefficient is +-1; full BKZ implementations instead
            # run LLL-with-removals on the extended block.
            last = np.nonzero(coeffs)[0][-1]
            if abs(coeffs[last]) == 1 and \
                    proj_norm < local_basis[0, 0] * 0.999:
                # Put the short vector first; shift the displaced
                # old vectors down by one position.
                old_block = B[i:i + last + 1].copy()
                B[i] = new_vec
                B[i + 1:i + last + 1] = old_block[:last]
                # Re-run LLL on the modified basis to restore structure
                B = lll_reduction(B)
                improved = True
if not improved:
# Basis has converged — no block improved this tour
break
return B
# === Example usage ===
basis = np.array([
[1, 1, 1, 1],
[-1, 0, 0, 0],
[3, 5, 6, 6],
[1, 2, 3, 7],
])
print("Original basis:")
print(basis)
print("Norms:", [round(np.linalg.norm(row), 2) for row in basis])
# BKZ with block size 3 (stronger than LLL)
reduced = bkz_reduction(basis, beta=3, max_tours=5)
print("\nBKZ-reduced basis (beta=3):")
print(np.rint(reduced).astype(int))
print("Norms:", [round(np.linalg.norm(row), 2) for row in reduced])How BKZ differs from LLL — step by step:
1. Start with LLL. BKZ always begins by running LLL on the full basis. This gives a reasonable starting point and ensures the basis is already size-reduced.
2. Sweep through blocks. For each index $i$, BKZ extracts the projected sublattice spanned by $b_i, \ldots, b_{\min(i+\beta-1,\,n)}$ projected orthogonally to $b_1, \ldots, b_{i-1}$. This projection isolates the "local" geometry of the block.
3. Solve SVP exactly. On each $\beta$-dimensional projected block, BKZ calls an exact shortest vector solver. The implementation above uses enumeration, a depth-first tree search that prunes branches whenever the partial norm exceeds the current best. This is exponential in $\beta$, which is why block size controls the time/quality tradeoff.
4. Insert and re-reduce. If the SVP solution is shorter than the current first vector of the block, it is inserted back into the full basis and LLL is re-run to restore global structure. This one insertion can cascade improvements through the entire basis.
5. Repeat tours. One full sweep through all blocks is a BKZ tour. The algorithm runs multiple tours until no block improves, meaning the basis has converged to a BKZ-$\beta$ reduced form.
Practical limits: this pure-Python implementation works for small examples (dimension ≤ 20, block size ≤ 15). For real cryptographic lattices, use FPLLL/fpylll which implements BKZ 2.0 with pruned enumeration, preprocessing, and floating-point acceleration.
BKZ in Practice: fpylll
For production lattice reduction, use the fpylll Python wrapper around FPLLL. It implements BKZ 2.0 with all modern optimizations:
from fpylll import IntegerMatrix, BKZ, LLL
from fpylll.algorithms.bkz2 import BKZReduction
# Create a random 50-dimensional lattice
A = IntegerMatrix.random(50, "qary", k=25, bits=30)
# First run LLL as a warm-up
LLL.reduction(A, delta=0.99)
print("After LLL, first vector norm:", A[0].norm())
# Configure BKZ 2.0 parameters
flags = BKZ.AUTO_ABORT | BKZ.MAX_LOOPS # auto-terminate + limit tours
params = BKZ.Param(
block_size=20, # beta: tradeoff quality vs. time
max_loops=8, # maximum number of BKZ tours
strategies=BKZ.DEFAULT_STRATEGY, # pruning strategies
flags=flags,
)
# Run BKZ 2.0 reduction
bkz = BKZReduction(A)
bkz(params)
print("After BKZ-20, first vector norm:", A[0].norm())
# Progressive reduction: increase block size for better output
for beta in [25, 30, 35, 40]:
params = BKZ.Param(block_size=beta, max_loops=4, flags=flags)
bkz(params)
print(f"After BKZ-{beta}, first vector norm: {A[0].norm():.2f}")Progressive reduction is a common technique: run BKZ with increasing block sizes . Each stage preprocesses the basis for the next, making larger block sizes converge much faster than starting cold. This mirrors how BKZ 2.0's internal preprocessing works — using smaller BKZ to warm up SVP calls at larger block sizes.
Extreme Pruning
Pruned enumeration (Gama–Nguyen–Regev, 2010; Aono–Wang–Hayashi–Takagi, 2016) replaces the exact SVP oracle with a randomized search that succeeds with probability $p$ per call but runs in time $T(p)$, where the cost-per-success ratio $T(p)/p$ can be much smaller than the exact enumeration cost when $p$ is chosen optimally.
Extreme pruning pushes $p$ very small (far below 1%) and runs many independent repetitions, rerandomizing the basis between them. The total cost is:

$$T_{\text{total}} \approx \frac{T(p)}{p}.$$

The bounding function $R_1 \le R_2 \le \cdots \le R_\beta$ defines a bounding hyperellipsoid at each level of the enumeration tree. At depth $k$, the squared partial norm of a candidate must satisfy $\|\pi_{\beta-k+1}(v)\|^2 \le R_k^2 \cdot R^2$, where $R$ is the enumeration radius. The optimal bounding function is found numerically by minimizing total cost subject to a target success probability.
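The success probability of a bounding function can be estimated by Monte Carlo under the standard GNR model, in which the shortest vector's normalized coefficient profile behaves like a uniform point in the unit ball. A hedged sketch (the linear bounding function $R_k^2 = (k/\beta)R^2$ is used purely as an example; the function names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_ball(d, n_samples):
    """Uniform samples from the unit d-ball: normalize Gaussians to
    the sphere, then scale radii by U^(1/d)."""
    g = rng.standard_normal((n_samples, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    r = rng.random(n_samples) ** (1.0 / d)
    return g * r[:, None]

def pruning_success_prob(bounds_sq, n_samples=200_000):
    """Monte Carlo estimate of Pr[sum_{j<=k} x_j^2 <= bounds_sq[k]
    for all k], x uniform in the unit ball -- the GNR
    cylinder-intersection success probability."""
    d = len(bounds_sq)
    x = sample_ball(d, n_samples)
    prefix = np.cumsum(x ** 2, axis=1)
    ok = np.all(prefix <= np.asarray(bounds_sq)[None, :], axis=1)
    return ok.mean()

beta = 40
linear = [(k + 1) / beta for k in range(beta)]  # linear pruning R_k^2 = k/beta
full = [1.0] * beta                             # no pruning
print("no pruning:    ", pruning_success_prob(full))
print("linear pruning:", pruning_success_prob(linear))
```

With no pruning every sample passes (each prefix sum is at most the total norm), while linear pruning keeps only a small fraction of the tree at a small fraction of the enumeration cost, which is exactly the tradeoff the optimizer tunes.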
With extreme pruning, BKZ becomes practical at block sizes up to roughly 100 on a standard laptop, and somewhat beyond that on large compute clusters. Pruned enumeration held the SVP records for years before sieving-based implementations such as g6k overtook it, and these concrete attack costs are a key input to how NIST security levels are set.