4.2 BKZ Algorithm
Block Korkine–Zolotarev. The practical lattice reducer that sets the cryptographic security baseline.
BKZ (Block Korkine–Zolotarev), introduced by Schnorr and Euchner in 1994, is the dominant lattice basis reduction algorithm in both theory and practice. Where LLL finds a $2^{O(n)}$-approximate shortest vector in polynomial time, BKZ trades exponential time for exponentially better output quality, parameterized by a block size $\beta$.
BKZ is the algorithm that cryptographers use to estimate the hardness of Kyber, Dilithium, NTRU, and all other lattice-based cryptosystems. Setting secure parameters means choosing dimensions where BKZ cannot terminate before the heat death of the universe.
Beginner's Intuition 💡
If LLL is a basic GPS that only looks at two streets at a time to find shortcuts, BKZ is a much more powerful navigation system.
Instead of just looking at two streets, BKZ looks at whole neighborhoods (called "blocks") at once, finding the absolute best, mathematically perfect path through that entire neighborhood before moving on. By stitching these perfect neighborhood paths together, it creates an incredibly efficient overall route.
Cryptographers constantly test their defenses against BKZ, ensuring their mathematical "mazes" are so massive that even peering at 100 streets at once isn't enough to find a way through.
The Core Idea
BKZ-$\beta$ applies LLL to improve the full basis, then calls an exact SVP solver on every consecutive $\beta$-dimensional projected sublattice. The SVP solution is inserted back into the basis, and the process repeats until convergence.
Think of it as LLL with a much stronger local improvement step. LLL only swaps adjacent pairs; BKZ restructures full $\beta$-dimensional blocks.
Projected Sublattices
For index $i$, define the local projected sublattice $L_{[i,\,i+\beta)}$ as the lattice spanned by the projections of $b_i, \ldots, b_{i+\beta-1}$ onto the orthogonal complement of $\mathrm{span}(b_1, \ldots, b_{i-1})$:

$$\pi_i(v) = v - \sum_{j=1}^{i-1} \frac{\langle v, b_j^* \rangle}{\langle b_j^*, b_j^* \rangle}\, b_j^*, \qquad L_{[i,\,i+\beta)} = L\big(\pi_i(b_i), \ldots, \pi_i(b_{i+\beta-1})\big).$$

BKZ finds the shortest vector in $L_{[i,\,i+\beta)}$ and uses it to update the basis. After an SVP call on each block, BKZ performs another LLL pass. This is one BKZ tour. The algorithm runs many tours until the basis stabilizes.
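The projection $\pi_i$ can be computed directly with NumPy. The sketch below (the helper names `gram_schmidt_vectors` and `project` are ours, chosen for illustration) subtracts the components along $b_1^*, \ldots, b_{i-1}^*$:

```python
import numpy as np

def gram_schmidt_vectors(B):
    """Plain Gram-Schmidt: returns the orthogonal vectors b*_j."""
    B_star = np.array(B, dtype=float)
    for i in range(len(B)):
        for j in range(i):
            B_star[i] -= ((B_star[i] @ B_star[j])
                          / (B_star[j] @ B_star[j])) * B_star[j]
    return B_star

def project(v, B_star, i):
    """pi_i(v): project v onto the orthogonal complement of
    span(b_1, ..., b_{i-1}) by subtracting the b*_j components."""
    v = np.array(v, dtype=float)
    for j in range(i):
        v -= ((v @ B_star[j]) / (B_star[j] @ B_star[j])) * B_star[j]
    return v

B = np.array([[2, 0, 0], [1, 3, 0], [1, 1, 4]], dtype=float)
B_star = gram_schmidt_vectors(B)
# pi_1 is the identity; pi_2 removes the b_1 component
print(project(B[1], B_star, 0))   # unchanged: [1. 3. 0.]
print(project(B[1], B_star, 1))   # b*_2:      [0. 3. 0.]
```

Note that $\pi_1$ is the identity, so the first block of BKZ works on the unprojected basis vectors themselves.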
Output Quality
The first basis vector after BKZ-$\beta$ satisfies:

$$\|b_1\| \le \gamma_\beta^{\frac{n-1}{\beta-1}} \cdot \lambda_1(L),$$

where $\gamma_\beta$ is the Hermite constant of dimension $\beta$. The root-Hermite factor of BKZ-$\beta$ is approximately:

$$\delta_\beta \approx \left( \frac{\beta}{2\pi e} \,(\pi\beta)^{1/\beta} \right)^{\frac{1}{2(\beta-1)}}.$$

For $\beta = 2$, BKZ reduces to LLL. As $\beta \to n$, BKZ approaches a Hermite–Korkine–Zolotarev (HKZ) reduced basis, whose first vector is an exact shortest vector.
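The root-Hermite-factor estimate above is easy to evaluate. A minimal sketch (the function name `root_hermite_factor` is ours; the formula is the standard asymptotic estimate, which is only meaningful for $\beta \gtrsim 50$):

```python
import math

def root_hermite_factor(beta):
    """Asymptotic root-Hermite-factor estimate for BKZ-beta.
    delta ~ (beta/(2*pi*e) * (pi*beta)^(1/beta))^(1/(2(beta-1))),
    valid for beta >~ 50."""
    return (beta / (2 * math.pi * math.e)
            * (math.pi * beta) ** (1.0 / beta)) ** (1.0 / (2 * (beta - 1)))

for beta in [50, 100, 200, 400]:
    print(f"beta={beta:4d}  delta ~ {root_hermite_factor(beta):.5f}")
```

The estimate decreases toward 1 as $\beta$ grows: around $\beta = 400$ it gives $\delta_\beta \approx 1.004$, which is the regime relevant for attacks on deployed schemes.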
Complexity
Each BKZ tour makes $O(n)$ calls to an SVP oracle in dimension $\beta$. The cost of exact SVP in dimension $\beta$ is roughly:

$$2^{0.292\beta + o(\beta)}$$

using the best known sieving algorithms. The number of BKZ tours needed for convergence is small in practice (though difficult to bound theoretically). Total complexity is therefore roughly:

$$\mathrm{poly}(n) \cdot 2^{0.292\beta + o(\beta)}.$$
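This cost model is the "core-SVP" convention used throughout lattice security estimation: count only the exponent of one SVP call and ignore polynomial factors. A minimal sketch (the constants are the published sieving exponents: 0.292 classical from the BDGL sieve, 0.265 quantum from Laarhoven's analysis; the helper name `core_svp_bits` is ours):

```python
# Core-SVP cost model: one SVP call in dimension beta costs about
# 2^(0.292*beta) classically and 2^(0.265*beta) quantumly; the
# poly(n) factor from tours and oracle calls is conventionally dropped.
CLASSICAL_EXP = 0.292
QUANTUM_EXP = 0.265

def core_svp_bits(beta, exponent=CLASSICAL_EXP):
    """log2 of the estimated SVP-oracle cost at block size beta."""
    return exponent * beta

for beta in [100, 200, 400, 600]:
    print(f"beta={beta}: ~2^{core_svp_bits(beta):.0f} classical, "
          f"~2^{core_svp_bits(beta, QUANTUM_EXP):.0f} quantum")
```

Because the exponent is linear in $\beta$, every extra unit of block size an attacker needs adds a fixed number of bits of security.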
BKZ 2.0
BKZ 2.0 (Chen and Nguyen, 2011) substantially improved the practical performance of BKZ with three innovations:
Preprocessing: each local SVP call is preprocessed by running BKZ with smaller block size on the local block first. This dramatically reduces the cost of the SVP oracle.
Pruned enumeration: instead of exact SVP via full enumeration, BKZ 2.0 uses pruned enumeration with an optimized bounding function. This allows much larger block sizes in practice; block sizes in the 60–80 range become practical on a laptop.
Early termination: the algorithm tracks the Gram–Schmidt profile across tours and aborts the reduction once the profile stops improving, avoiding wasted computation on tours that no longer change the basis quality.
The GSA and Security Estimation
The Geometric Series Assumption (GSA) predicts that after BKZ-$\beta$, the Gram–Schmidt norms decay geometrically:

$$\|b_i^*\| \approx \alpha^{\,i-1}\,\|b_1\|, \qquad \alpha \approx \delta_\beta^{-2}.$$

This model underlies nearly all lattice security estimates. Tools like the Lattice Estimator use the GSA to predict at what block size BKZ would recover a decryption key or a signing key, then equate that to a bit security level via the SVP cost formula.
For Kyber-512, the BKZ block size needed for a key recovery attack is approximately $\beta \approx 400$, corresponding to roughly $2^{118}$ operations in the core-SVP model, the source of its claimed 128-bit post-quantum security.
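Inverting the root-Hermite-factor estimate gives a crude block-size predictor of the kind security estimators automate. This is a toy sketch (the names `block_size_for_delta` and `root_hermite_factor` are ours; the real Lattice Estimator models the full attack, not just the Hermite factor):

```python
import math

def root_hermite_factor(beta):
    """Standard asymptotic root-Hermite-factor estimate for BKZ-beta."""
    return (beta / (2 * math.pi * math.e)
            * (math.pi * beta) ** (1.0 / beta)) ** (1.0 / (2 * (beta - 1)))

def block_size_for_delta(target_delta):
    """Smallest beta whose predicted root-Hermite factor reaches the
    target (linear scan; the estimate is monotone in this range)."""
    beta = 50
    while root_hermite_factor(beta) > target_delta:
        beta += 1
    return beta

# An attack that requires delta ~ 1.004 needs roughly:
beta = block_size_for_delta(1.004)
print(f"beta ~ {beta}, core-SVP cost ~ 2^{0.292 * beta:.0f}")
```

This reproduces the right order of magnitude: a target of $\delta \approx 1.004$ lands near $\beta \approx 400$, matching the Kyber-512 figure above.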
The BKZ Shape and Gram–Schmidt Profile
After running BKZ-$\beta$ for several tours, the Gram–Schmidt norms $\|b_i^*\|$ plotted against the index $i$ form a characteristic BKZ shape. Understanding this shape is essential for predicting attack costs and for using the BKZ simulator.
In the Geometric Series Assumption (GSA) regime, the log-profile is a straight line with slope $-2\log\delta_\beta$:

$$\log\|b_i^*\| \approx \log\|b_1\| - 2(i-1)\log\delta_\beta.$$

In reality the profile deviates from the GSA: the first few and last few Gram–Schmidt vectors are shorter than predicted (the "head" and "tail" effects). The head effect arises because the first SVP call is unrestricted by any projection; the tail effect arises because the final blocks have dimension smaller than $\beta$ and end up essentially HKZ-reduced. The BKZ simulator (Chen–Nguyen, 2011) models these deviations using a probabilistic model of SVP solutions in each block.
Simulating BKZ
The BKZ simulator predicts the Gram–Schmidt profile after a given number of BKZ tours without actually running BKZ. It is the engine behind the lattice-estimator (Albrecht et al.) used to generate NIST security estimates.
The simulator works as follows. Let $\ell_i = \log\|b_i^*\|$. After processing the block starting at index $i$ in one tour, the simulator updates:

$$\ell_i \leftarrow \log \mathrm{GH}(\beta) + \frac{1}{\beta}\sum_{j=i}^{i+\beta-1} \ell_j,$$

where $\mathrm{GH}(d)$ is the Gaussian heuristic radius for dimension $d$. The update reflects that the SVP oracle finds a vector of length equal to the Gaussian heuristic applied to the projected sublattice. The simulator then sweeps through all blocks for each tour, producing the predicted profile.
In practice the simulator matches experimentally observed BKZ profiles to within 1–2% for block sizes above roughly 50. It allows security estimators to predict the exact block size at which BKZ would distinguish an LWE sample from uniform (the decryption-failure distinguisher threshold) without months of computation.
Python Implementation from Scratch
The following is a complete Python implementation of BKZ reduction. It builds on the LLL algorithm by adding an exact SVP oracle call on each projected sublattice block. The SVP oracle uses enumeration — exhaustive search over all short vectors in the projected block.
import numpy as np
def gram_schmidt(B):
"""Compute Gram-Schmidt orthogonalization and coefficients."""
n = B.shape[0]
B_star = np.zeros_like(B, dtype=float)
mu = np.zeros((n, n), dtype=float)
for i in range(n):
B_star[i] = B[i].copy()
for j in range(i):
mu[i, j] = np.dot(B[i], B_star[j]) / np.dot(B_star[j], B_star[j])
B_star[i] -= mu[i, j] * B_star[j]
return B_star, mu
def lll_reduction(B, delta=0.99):
"""LLL reduction (used as a subroutine inside BKZ)."""
B = np.array(B, dtype=float)
n = B.shape[0]
B_star, mu = gram_schmidt(B)
k = 1
while k < n:
for j in range(k - 1, -1, -1):
if abs(mu[k, j]) > 0.5:
B[k] -= round(mu[k, j]) * B[j]
B_star, mu = gram_schmidt(B)
norm_k = np.dot(B_star[k], B_star[k])
norm_k1 = np.dot(B_star[k - 1], B_star[k - 1])
if norm_k + mu[k, k - 1] ** 2 * norm_k1 >= delta * norm_k1:
k += 1
else:
B[[k, k - 1]] = B[[k - 1, k]]
B_star, mu = gram_schmidt(B)
k = max(k - 1, 1)
return B
def project_sublattice(B_star, mu, i, beta, n):
"""
Compute the projected sublattice basis for block [i, i+beta).
The projection pi_i maps vectors onto the orthogonal complement
of span(b_1, ..., b_{i-1}). In the Gram-Schmidt basis, this is
simply extracting the relevant coefficients.
    Returns:
        local: block_size x block_size lower-triangular matrix
            representing the projected block (the block may be
            smaller than beta at the tail of the basis)
        block_size: actual dimension of the block
    """
block_end = min(i + beta, n)
block_size = block_end - i
# Build the local basis in the projected Gram-Schmidt coordinates
local = np.zeros((block_size, block_size), dtype=float)
for k in range(block_size):
# Diagonal: the Gram-Schmidt norm ||b*_{i+k}||
local[k, k] = np.linalg.norm(B_star[i + k])
# Off-diagonal: mu coefficients within the block
for j in range(k):
local[k, j] = mu[i + k, i + j] * np.linalg.norm(B_star[i + j])
return local, block_size
def enumerate_svp(local_basis, block_size):
"""
Solve SVP exactly on a small projected block via enumeration.
Uses depth-first search over integer coefficient vectors,
pruning branches where the partial norm exceeds the current best.
This is practical for block_size up to ~25-30.
Returns:
coeffs: integer coefficient vector of the shortest vector
"""
    # Gram-Schmidt data of the local block, computed once up front
    # (recomputing it at every enumeration node is needlessly slow)
    B_star_local, mu_local = gram_schmidt(local_basis)
    norms_sq = np.array([np.dot(B_star_local[j], B_star_local[j])
                         for j in range(block_size)])
    # Start with the first basis vector as the current shortest
    best_norm_sq = np.dot(local_basis[0], local_basis[0])
    best_coeffs = np.zeros(block_size, dtype=int)
    best_coeffs[0] = 1
    def _enumerate(level, partial_norm_sq, coeffs):
        nonlocal best_norm_sq, best_coeffs
        if level < 0:
            if partial_norm_sq > 0 and partial_norm_sq < best_norm_sq:
                best_norm_sq = partial_norm_sq
                best_coeffs = coeffs.copy()
            return
        # Compute the enumeration center for this level
        center = 0.0
        for j in range(level + 1, block_size):
            center -= coeffs[j] * mu_local[j, level]
# Search radius: how far from center can we go?
remaining = best_norm_sq - partial_norm_sq
if remaining <= 0:
return
bound = int(np.sqrt(remaining / norms_sq[level]) + abs(center)) + 1
for x in range(-bound, bound + 1):
coeffs[level] = x
new_partial = partial_norm_sq + (x + center) ** 2 * norms_sq[level]
if new_partial < best_norm_sq:
_enumerate(level - 1, new_partial, coeffs)
coeffs = np.zeros(block_size, dtype=int)
_enumerate(block_size - 1, 0.0, coeffs)
return best_coeffs
def bkz_reduction(B, beta=10, max_tours=10):
"""
BKZ lattice basis reduction.
Args:
B: integer matrix where rows are basis vectors (n x m)
beta: block size. Controls quality vs. cost tradeoff.
beta=2 is equivalent to LLL.
Larger beta → better reduction but exponential cost.
max_tours: maximum number of BKZ tours (full sweeps).
Returns:
B: BKZ-reduced basis
"""
B = np.array(B, dtype=float)
n = B.shape[0]
# Start with an LLL-reduced basis
B = lll_reduction(B)
for tour in range(max_tours):
improved = False
# One BKZ tour: sweep through all blocks [i, i+beta)
for i in range(n - 1):
B_star, mu = gram_schmidt(B)
block_end = min(i + beta, n)
block_size = block_end - i
if block_size < 2:
continue
# === Step 1: Project onto the local sublattice ===
local_basis, bs = project_sublattice(B_star, mu, i, beta, n)
# === Step 2: Solve SVP on the projected block ===
coeffs = enumerate_svp(local_basis, bs)
            # === Step 3: Insert the short vector back ===
            # Lift the SVP solution from projected coordinates to the
            # full basis: the same integer coefficients apply.
            new_vec = np.zeros_like(B[0], dtype=float)
            for k in range(bs):
                new_vec += coeffs[k] * B[i + k]
            # Projected length of the candidate, measured in the
            # local block coordinates (rows of local_basis).
            proj_norm = np.linalg.norm(coeffs @ local_basis)
            # Replacing the basis vector at the last nonzero
            # coefficient preserves the lattice only when that
            # coefficient is +-1; full BKZ implementations instead
            # run LLL-with-removals on the extended block.
            last = np.nonzero(coeffs)[0][-1]
            if abs(coeffs[last]) == 1 and \
                    proj_norm < local_basis[0, 0] * 0.999:
                # Put the short vector first; shift the displaced
                # old vectors down by one position.
                old_block = B[i:i + last + 1].copy()
                B[i] = new_vec
                B[i + 1:i + last + 1] = old_block[:last]
                # Re-run LLL on the modified basis to restore structure
                B = lll_reduction(B)
                improved = True
if not improved:
# Basis has converged — no block improved this tour
break
return B
# === Example usage ===
basis = np.array([
[1, 1, 1, 1],
[-1, 0, 0, 0],
[3, 5, 6, 6],
[1, 2, 3, 7],
])
print("Original basis:")
print(basis)
print("Norms:", [round(np.linalg.norm(row), 2) for row in basis])
# BKZ with block size 3 (stronger than LLL)
reduced = bkz_reduction(basis, beta=3, max_tours=5)
print("\nBKZ-reduced basis (beta=3):")
print(np.rint(reduced).astype(int))
print("Norms:", [round(np.linalg.norm(row), 2) for row in reduced])How BKZ differs from LLL — step by step:
1. Start with LLL. BKZ always begins by running LLL on the full basis. This gives a reasonable starting point and ensures the basis is already size-reduced.
2. Sweep through blocks. For each index $i$, BKZ extracts the projected sublattice spanned by $b_i, \ldots, b_{\min(i+\beta-1,\,n)}$ projected orthogonally to $b_1, \ldots, b_{i-1}$. This projection isolates the "local" geometry of the block.
3. Solve SVP exactly. On each $\beta$-dimensional projected block, BKZ calls an exact shortest vector solver. The implementation above uses enumeration, a depth-first tree search that prunes branches whenever the partial norm exceeds the current best. This is exponential in $\beta$, which is why block size controls the time/quality tradeoff.
4. Insert and re-reduce. If the SVP solution is shorter than the current first vector of the block, it is inserted back into the full basis and LLL is re-run to restore global structure. This one insertion can cascade improvements through the entire basis.
5. Repeat tours. One full sweep through all blocks is a BKZ tour. The algorithm runs multiple tours until no block improves, meaning the basis has converged to a BKZ-$\beta$ reduced form.
Practical limits: this pure-Python implementation works for small examples (dimension ≤ 20, block size ≤ 15). For real cryptographic lattices, use FPLLL/fpylll which implements BKZ 2.0 with pruned enumeration, preprocessing, and floating-point acceleration.
BKZ in Practice: fpylll
For production lattice reduction, use the fpylll Python wrapper around FPLLL. It implements BKZ 2.0 with all modern optimizations:
from fpylll import IntegerMatrix, BKZ, LLL
from fpylll.algorithms.bkz2 import BKZReduction
# Create a random 50-dimensional lattice
A = IntegerMatrix.random(50, "qary", k=25, bits=30)
# First run LLL as a warm-up
LLL.reduction(A, delta=0.99)
print("After LLL, first vector norm:", A[0].norm())
# Configure BKZ 2.0 parameters
flags = BKZ.AUTO_ABORT | BKZ.MAX_LOOPS # auto-terminate + limit tours
params = BKZ.Param(
block_size=20, # beta: tradeoff quality vs. time
max_loops=8, # maximum number of BKZ tours
strategies=BKZ.DEFAULT_STRATEGY, # pruning strategies
flags=flags,
)
# Run BKZ 2.0 reduction
bkz = BKZReduction(A)
bkz(params)
print("After BKZ-20, first vector norm:", A[0].norm())
# Progressive reduction: increase block size for better output
for beta in [25, 30, 35, 40]:
params = BKZ.Param(block_size=beta, max_loops=4, flags=flags)
bkz(params)
print(f"After BKZ-{beta}, first vector norm: {A[0].norm():.2f}")Progressive reduction is a common technique: run BKZ with increasing block sizes . Each stage preprocesses the basis for the next, making larger block sizes converge much faster than starting cold. This mirrors how BKZ 2.0's internal preprocessing works — using smaller BKZ to warm up SVP calls at larger block sizes.
Extreme Pruning
Pruned enumeration (Gama–Nguyen–Regev, 2010; Aono–Wang–Hayashi–Takagi, 2016) replaces the exact SVP oracle with a randomized search that succeeds with probability $p$ per call but runs in time $T(p)$, where the cost-per-success ratio $T(p)/p$ can be much smaller than the exact enumeration cost when $p$ is chosen optimally.
Extreme pruning pushes $p$ very small (far below 1%) and runs many independent repetitions, rerandomizing the basis between them. The total cost is:

$$T_{\text{total}} \approx \frac{T(p)}{p}.$$

The bounding function $R_1 \le R_2 \le \cdots \le R_\beta$ defines a bounding hyperellipsoid at each level of the enumeration tree. At depth $k$, the squared partial norm of a candidate must satisfy $\|\pi_{\beta-k+1}(v)\|^2 \le R_k^2 \cdot R^2$, where $R$ is the enumeration radius. The optimal bounding function is found numerically by minimizing total cost subject to a target success probability.
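The success probability of a bounding function can be estimated by Monte Carlo under the standard GNR model, in which the shortest vector's normalized coefficient profile behaves like a uniform point in the unit ball. A hedged sketch (the linear bounding function $R_k^2 = (k/\beta)R^2$ is used purely as an example; the function names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_ball(d, n_samples):
    """Uniform samples from the unit d-ball: normalize Gaussians to
    the sphere, then scale radii by U^(1/d)."""
    g = rng.standard_normal((n_samples, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    r = rng.random(n_samples) ** (1.0 / d)
    return g * r[:, None]

def pruning_success_prob(bounds_sq, n_samples=200_000):
    """Monte Carlo estimate of Pr[sum_{j<=k} x_j^2 <= bounds_sq[k]
    for all k], x uniform in the unit ball -- the GNR
    cylinder-intersection success probability."""
    d = len(bounds_sq)
    x = sample_ball(d, n_samples)
    prefix = np.cumsum(x ** 2, axis=1)
    ok = np.all(prefix <= np.asarray(bounds_sq)[None, :], axis=1)
    return ok.mean()

beta = 40
linear = [(k + 1) / beta for k in range(beta)]  # linear pruning R_k^2 = k/beta
full = [1.0] * beta                             # no pruning
print("no pruning:    ", pruning_success_prob(full))
print("linear pruning:", pruning_success_prob(linear))
```

With no pruning every sample passes (each prefix sum is at most the total norm), while linear pruning keeps only a small fraction of the tree at a small fraction of the enumeration cost, which is exactly the tradeoff the optimizer tunes.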
With extreme pruning, BKZ becomes practical at block sizes up to roughly 100 on a standard laptop, and somewhat beyond that on large compute clusters. Pruned enumeration held the SVP records for years before sieving-based implementations such as g6k overtook it, and these concrete attack costs are a key input to how NIST security levels are set.