3.2 Fully Homomorphic Encryption
Running programs on data you cannot see — and why this was considered impossible for 30 years.
Encryption is usually a one-way door. You lock data, send it to a server, and the server stores it blindly until you unlock it. If you want the server to actually do something with the data — search it, run a model on it, add two numbers — you have to decrypt it first, exposing it.
Fully Homomorphic Encryption (FHE) breaks this assumption entirely. With FHE, a server can run arbitrary programs on your encrypted data — adding, multiplying, sorting, evaluating neural networks — and produce an encrypted result. You decrypt only the final answer. The server never sees your data, not even for a microsecond.
This was considered a "holy grail" of cryptography for 30 years. Craig Gentry showed it was possible in his 2009 Stanford dissertation (cited over 9,500 times) using ideal lattice mathematics. All practical FHE schemes today are built on LWE and Ring-LWE.
Beginner's Intuition 💡
Imagine you have a priceless raw diamond that you want a master jeweler to cut, but you don't trust the jeweler not to steal it. Fully Homomorphic Encryption (FHE) is like putting the diamond inside an unbreakable, locked glovebox with built-in remote-control tools.
You give the locked box to the jeweler. The jeweler can use the tools to perfectly cut and polish the diamond, but they can never actually open the box, touch the diamond, or even see exactly what it looks like. When they return the box, only you have the key to open it and take out the finished gem. FHE allows cloud servers to process and analyze your data perfectly—without ever decrypting or "seeing" it!
What "Homomorphic" Means
The word homomorphic means "structure-preserving." An encryption scheme is homomorphic if performing operations on ciphertexts corresponds to performing operations on the underlying plaintexts. Concretely: if $c_1 = \mathrm{Enc}(m_1)$ and $c_2 = \mathrm{Enc}(m_2)$, then there exists an operation $c_1 \oplus c_2$ such that $\mathrm{Dec}(c_1 \oplus c_2) = m_1 + m_2$ — without ever decrypting.
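This structure-preserving property can be demonstrated with a toy additively homomorphic scheme — a one-dimensional LWE-style sketch. Every name and parameter below is illustrative, not a real library API:

```python
import random

# Toy additively homomorphic scheme (LWE-style, dimension 1, illustrative only).
q = 2**16                      # ciphertext modulus
DELTA = 2**8                   # scaling factor separating message from noise
s = random.randrange(q)        # secret key

def enc(m):
    """Encrypt m as (a, b) with b = a*s + DELTA*m + small noise (mod q)."""
    a = random.randrange(q)
    e = random.randint(-4, 4)  # small fresh noise
    return (a, (a * s + DELTA * m + e) % q)

def dec(c):
    """Recover m by removing a*s and rounding the noise away."""
    a, b = c
    v = (b - a * s) % q
    return round(v / DELTA) % (q // DELTA)

def add(c1, c2):
    """Homomorphic addition: component-wise sum of ciphertexts."""
    return ((c1[0] + c2[0]) % q, (c1[1] + c2[1]) % q)

c = add(enc(5), enc(7))
print(dec(c))  # prints 12: the sum, computed without decrypting the inputs
```

Multiplying a ciphertext by a known constant works the same way, since the scheme is linear; supporting ciphertext-times-ciphertext multiplication is the hard part that separates fully homomorphic schemes from additive ones.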
Definition 18 — Fully Homomorphic Encryption
An FHE scheme supports any computable function on ciphertexts. Given encryptions of inputs $m_1, \dots, m_k$ and any function $f$, an FHE scheme can produce an encryption of $f(m_1, \dots, m_k)$ without decrypting anything:
$\mathrm{Enc}(f(m_1, \dots, m_k)) = \mathrm{Eval}(f,\; \mathrm{Enc}(m_1),\; \dots,\; \mathrm{Enc}(m_k))$
The Eval algorithm operates only on ciphertexts and the description of $f$. The secret key is needed only to decrypt the final output.
The Noise Problem
FHE is built on LWE (see SVP for hardness), which means ciphertexts carry random noise. Each arithmetic operation grows that noise. Think of the noise as sand accumulating in a filter: a little is fine; too much and you can no longer see through it. After enough operations — especially multiplications — the noise exceeds the decryption threshold and the ciphertext becomes unreadable.
This is why early LWE schemes supported only a bounded number of operations: "somewhat homomorphic encryption" (SHE). To get to fully homomorphic encryption, you need to be able to reset the noise level without decrypting. Gentry's 2009 insight was that you can.
The Bootstrapping Trick
Bootstrapping is the operation that makes FHE "full." The idea: take a noisy ciphertext and homomorphically evaluate the decryption circuit on it, using an encrypted copy of the secret key.
Think of it as: "decrypt-then-re-encrypt, but done entirely inside the encrypted world." The output is a fresh ciphertext encrypting the same value, but with fresh, small noise. You've refreshed the ciphertext without learning what it contains.
This works as long as the scheme can homomorphically evaluate its own decryption circuit — a condition called "bootstrappability." Once you have bootstrapping, you can chain an unlimited number of operations.
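The effect of bootstrapping can be sketched as pure noise bookkeeping. The numbers below are assumed for illustration (real schemes track noise analytically), but the pattern is the point: noise accumulates, bootstrapping resets it, and the chain can continue indefinitely:

```python
# Toy noise-budget simulation; all figures are assumed example values.
FRESH_NOISE = 10      # bits of noise in a fresh ciphertext (assumed)
THRESHOLD = 60        # decryption fails once noise exceeds this (assumed)

class Ciphertext:
    def __init__(self, noise_bits=FRESH_NOISE):
        self.noise_bits = noise_bits

    def multiply(self, other):
        # Rough model: noise bits add under multiplication.
        return Ciphertext(self.noise_bits + other.noise_bits)

    def bootstrap(self):
        # Homomorphic decrypt-and-re-encrypt: same value, fresh small noise.
        return Ciphertext(FRESH_NOISE)

c = Ciphertext()
for _ in range(10):                      # chain 10 multiplications
    c = c.multiply(Ciphertext())
    if c.noise_bits > THRESHOLD - FRESH_NOISE:
        c = c.bootstrap()                # refresh before noise becomes fatal
    assert c.noise_bits <= THRESHOLD     # decryption always remains possible
```

Without the `bootstrap` call, the loop would exceed the threshold after a fixed number of steps; with it, the number of chained operations is unbounded.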
Three Flavors of FHE
Different FHE schemes make different tradeoffs. The right choice depends on what kind of computation you need to run on encrypted data.
BFV / BGV — Exact integer arithmetic
Encrypt integers. Add and multiply them exactly, modulo a plaintext modulus.
Best for: database queries, counting, comparisons, anything that needs exact integer results. Supports SIMD batching: encode thousands of integers into one ciphertext and operate on all of them in parallel using the Chinese Remainder Theorem.
CKKS — Approximate real arithmetic
Encrypt real (floating-point) numbers. Results are approximate, with controlled precision loss.
Best for: machine learning inference, statistics, signal processing. The approximation is usually acceptable — a neural network doesn't need exact arithmetic. Fastest FHE scheme for practical ML workloads.
TFHE / FHEW — Boolean gates
Encrypt single bits. Evaluate any Boolean circuit, one gate at a time. Each gate takes a few milliseconds but includes a fast bootstrapping step.
Best for: arbitrary programs, branching, ReLU, comparisons, sorting — operations that are hard to express as polynomials. Ideal when you need to run code you can't modify for FHE.
CKKS in Depth: FHE for Machine Learning
CKKS (Cheon-Kim-Kim-Song, 2017) is the most widely used FHE scheme for real-world applications. It encodes a vector of real numbers as a polynomial and allows approximate additions and multiplications. Here is how the encoding works:
How CKKS Encodes Real Numbers
1. Start with a vector of real numbers $z = (z_1, \dots, z_{n/2})$.
2. Apply the inverse canonical embedding to map $z$ to a polynomial $m(x)$.
3. Scale up: multiply by a large scaling factor $\Delta$ (typically $2^{40}$ to $2^{60}$) and round to integers, getting $p(x) = \lfloor \Delta \cdot m(x) \rceil$.
4. Encrypt $p(x)$ as a Ring-LWE ciphertext.
After homomorphic operations, decoding reverses steps 2–4. The scaling factor controls the precision: with $\Delta = 2^{40}$, you get about 12 decimal digits of precision — better than 32-bit float.
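Step 3 can be checked in isolation with plain arithmetic. Only the scaling factor $\Delta$ comes from the text; the rest is a hypothetical fixed-point demo, with no encryption involved:

```python
from math import pi

# Scale-and-round (step 3 of the encoding), then decode by dividing back out.
DELTA = 2**40

encoded = round(pi * DELTA)   # fixed-point integer representation of pi
decoded = encoded / DELTA     # decode: divide the scale back out

error = abs(decoded - pi)
print(error <= 2**-41)        # True: rounding error is at most 1/(2*DELTA)
```

With $\Delta = 2^{40}$ the worst-case rounding error is $2^{-41} \approx 4.5 \times 10^{-13}$, which is where the "about 12 decimal digits" figure comes from.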
Noise Growth and the Modulus Chain
Each multiplication in CKKS sharply grows the noise and consumes one "level" of the ciphertext modulus. Before starting, you choose a chain of moduli $q_L > q_{L-1} > \cdots > q_0$. Each multiplication uses the rescale operation (dividing the ciphertext modulus by one step in the chain) to control noise. You get at most $L$ levels of multiplications before you either bootstrap or stop.
Choosing Parameters for a CKKS Computation
Say you want to evaluate a depth-10 neural network (10 layers of multiplications) with 40-bit precision:
- You need 10 "levels," each consuming about 40 bits of modulus.
- Add one special modulus for the initial ciphertext: ~40 bits.
- Total modulus: $\log_2 q \approx 11 \times 40 = 440$ bits.
- To achieve 128-bit security at this modulus size, you need a ring dimension $n = 2^{14} = 16{,}384$.
Each ciphertext encrypts a vector of $n/2 = 8{,}192$ real numbers in parallel. The SIMD nature of CKKS means you amortize computation over thousands of values simultaneously.
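The sizing arithmetic above can be written out directly. The bits-per-level and security figures are the text's example values, not derived here:

```python
# Back-of-envelope CKKS parameter sizing for the depth-10 example.
depth = 10                    # multiplicative depth of the network
bits_per_level = 40           # modulus bits consumed per multiplication
total_modulus_bits = (depth + 1) * bits_per_level  # +1 for the fresh ciphertext

n = 2**14                     # ring dimension for 128-bit security at ~440 bits
slots = n // 2                # real values packed per ciphertext

print(total_modulus_bits, slots)  # 440 8192
```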
Real-World Applications
Private ML inference
A hospital sends encrypted patient data to a cloud service running a diagnostic model. The model runs on the encrypted data and returns an encrypted prediction. The hospital decrypts the result. The cloud service — and anyone who compromised it — learns nothing about the patient. CKKS makes this viable: matrix multiplications, convolutions, and polynomial activations all map naturally to CKKS operations.
Private information retrieval
You want to query a database without revealing which record you're looking for. With BFV/BGV, you can encrypt an index, send it to the server, and receive an encrypted result — the server returns the right data without knowing which row you queried. Used in DNS privacy, genomic queries, and secure voting.
Genomics and healthcare
Genome-wide association studies (GWAS) require sharing genomic data across institutions — a massive privacy risk. FHE allows researchers to run statistical analyses on encrypted genomic data from multiple hospitals without any hospital exposing its patients' raw sequences. iDASH competitions have demonstrated FHE-based GWAS and logistic regression in practice.
Confidential smart contracts
Blockchain transactions are fully public. FHE enables smart contracts where inputs remain encrypted: blind auctions (bids are private), private voting, DeFi strategies that don't reveal trading intent. Projects like Fhenix and Zama's fhEVM are building production systems on TFHE.
Performance and the Road Ahead
FHE is still slower than plaintext computation by 3–6 orders of magnitude. A multiplication that takes nanoseconds on plaintext data takes milliseconds on ciphertexts. Bootstrapping — the operation that refreshes noise — takes seconds on CPU. This makes FHE practical today only for specific applications where the privacy guarantee is worth the cost.
Where performance stands in 2025
CKKS multiplication (CPU): ~5 ms per multiplication, ~8,000 values in parallel.
CKKS bootstrapping (CPU): ~2–10 seconds. On GPU (H100): ~200 ms.
TFHE gate (CPU): ~1–10 ms per Boolean gate, now with fast programmable bootstrapping.
Hardware acceleration is the active frontier. Purpose-built FHE chips (from companies like Cornami, Optalysys, and research groups at MIT and Stanford) target 1000× speedups over CPU. The consensus in the field: within 5–10 years, FHE will be fast enough for real-time applications.
Libraries to Get Started
OpenFHE
Open-source C++ library. Supports BFV, BGV, CKKS, TFHE, FHEW. The most complete open-source FHE library — NIST-standard compliant, actively maintained by Duality Technologies and the research community. Python bindings available.
Microsoft SEAL
C++ library by Microsoft Research. Supports BFV and CKKS. Excellent documentation, widely used in industry. Has Python bindings (PySEAL) and .NET wrappers. Used internally in Microsoft Azure Confidential Computing offerings.
TFHE-rs / Concrete
Rust library by Zama. Implements TFHE with fast programmable bootstrapping. Has a high-level compiler that takes regular Rust code and compiles it to run on encrypted data. Powers Zama's fhEVM for confidential Ethereum smart contracts.
Lattigo
Go library implementing CKKS and BFV. Clean API, good documentation. Well-suited for building FHE applications in Go-based server environments. Used in several healthcare and financial privacy projects.
The Four Generations of FHE
FHE did not arrive fully formed. It took a sequence of conceptual breakthroughs — each one removing a different obstacle — to get from Gentry's original 2009 construction to the fast, practical schemes deployed today.
Generation 1 — Gentry 2009: Proof of Concept
Craig Gentry's 2009 Stanford dissertation gave the first construction of a fully homomorphic encryption scheme. It was built on ideal lattices (polynomial rings) and used a "squashing" technique to reduce the decryption circuit's depth enough to allow bootstrapping. The scheme was far too slow for any practical use (a single bootstrapping step took hours), but it answered the 30-year-old open question: yes, FHE is possible. The key insight was that an encryption scheme is fully homomorphic if and only if it can homomorphically evaluate its own decryption circuit (bootstrappability). Every FHE scheme since has used this principle.
Generation 2 — BV/BGV 2011–2012: LWE and Modulus Switching
Brakerski and Vaikuntanathan (BV, 2011) rebuilt FHE on the Learning With Errors (LWE) problem and Ring-LWE, which have much stronger worst-case hardness guarantees than the ideal lattice assumptions Gentry used. Brakerski, Gentry, and Vaikuntanathan (BGV, 2012) introduced modulus switching: instead of bootstrapping after every multiplication, you can reduce the ciphertext modulus to control noise growth, effectively deferring bootstrapping until many operations have been performed. This gave rise to leveled FHE — schemes that support a predetermined number of multiplications without bootstrapping. BGV is still one of the most widely used schemes for exact integer arithmetic. The BFV scheme (Fan-Vercauteren, 2012) is a variant of BGV that is often easier to implement and is the basis of Microsoft SEAL.
Generation 3 — GSW 2013: Bootstrapping for Boolean Circuits
Gentry, Sahai, and Waters (GSW, 2013) introduced a fundamentally different approach to homomorphic evaluation: represent ciphertexts as matrices and operations as matrix products. This eliminated the need for key-switching (a technically complex operation needed in BV/BGV) and simplified the noise analysis dramatically. GSW is the theoretical foundation for TFHE and FHEW — schemes that bootstrap after every Boolean gate. The GSW-based approach makes bootstrapping fast enough (under 100 ms in modern implementations) that it can be performed after every single gate, enabling the evaluation of arbitrary programs on encrypted data without any depth restriction. The trade-off: GSW evaluates one gate at a time, so it is slower than BGV/CKKS for arithmetic-heavy computations.
Generation 4 — CKKS 2017: Approximate Arithmetic for Machine Learning
Cheon, Kim, Kim, and Song (CKKS, 2017) made the observation that many real-world computations — machine learning inference, statistics, signal processing — do not require exact arithmetic. They are inherently approximate. CKKS reframes the noise in LWE ciphertexts not as an error to be managed, but as a feature: the noise naturally provides the rounding that floating-point arithmetic already performs. By encoding real numbers with a large scaling factor and treating the LWE noise as a small approximation error, CKKS achieves far higher throughput than exact schemes. A single CKKS ciphertext can encode and simultaneously process thousands of real numbers via SIMD packing. This generation unlocked FHE for practical machine learning workloads and is the scheme behind most deployed FHE systems today.
Noise Growth and Bootstrapping
Understanding noise growth is essential for anyone working with FHE libraries. Every operation changes the noise level in a ciphertext — and if that noise grows too large, decryption fails.
In an LWE-based ciphertext $(a,\; b = \langle a, s \rangle + e + \Delta m)$, the error $e$ is small after encryption. After a homomorphic addition of two ciphertexts:
$e_{\mathrm{add}} = e_1 + e_2$
Addition roughly doubles the noise. But homomorphic multiplication is far more expensive for noise. Multiplying two ciphertexts involves a tensor product, with error roughly:
$e_{\mathrm{mult}} \approx \Delta m_1 \cdot e_2 + \Delta m_2 \cdot e_1 + e_1 \cdot e_2$
After relinearization (which uses a "relinearization key" to reduce the ciphertext dimension back to 2), the error is approximately:
$e_{\mathrm{relin}} \approx e_{\mathrm{mult}} + n \cdot \log(q) \cdot B$
where $n$ is the ring dimension, $q$ is the ciphertext modulus, and $B$ is a bound related to the relinearization key decomposition. In practice, each multiplication multiplies the noise by a factor of roughly $\Delta$ before rescaling. For typical CKKS parameters with $\Delta = 2^{40}$ to $2^{60}$, each multiplication consumes about 40–60 bits of the modulus budget.
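The per-multiplication budget accounting can be sketched in a few lines. Every figure here is an assumed example value, not a measurement of any real scheme:

```python
# Illustrative noise bookkeeping in bits; all numbers are assumed examples.
log_delta = 40        # log2 of the CKKS scaling factor
log_q = 440           # log2 of the full modulus chain
fresh_noise = 10      # bits of noise in a fresh ciphertext

# Addition: e1 + e2 at most doubles the noise, i.e. adds about one bit.
noise = fresh_noise + 1

# Multiplication: noise grows by ~log_delta bits, and rescaling then removes
# one log_delta-sized level of modulus, keeping the noise level roughly flat.
levels = log_q // log_delta - 1   # reserve one level for the fresh ciphertext
print(levels)                     # 10 multiplications before bootstrapping
```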
Bootstrapping is the mechanism that resets the noise level. The core idea: treat the noisy ciphertext as an input, and homomorphically evaluate the decryption function on it using an encrypted copy of the secret key (the "bootstrapping key"). The output is a fresh encryption of the same plaintext, with small noise.
The decryption function for a Ring-LWE scheme is:
$m = \left\lfloor \frac{p \cdot (b - a \cdot s)}{q} \right\rceil \pmod{p}$
Evaluating this homomorphically requires computing the inner product $a \cdot s$ where $s$ is encrypted (the bootstrapping key holds encryptions of each bit/digit of the secret key). For CKKS, bootstrapping additionally requires a slot-to-coefficient transform and a coefficient-to-slot transform (using a homomorphic DFT), plus approximating the modular reduction via a polynomial approximation of the rounding function. This is why CKKS bootstrapping takes longer than BGV/BFV bootstrapping: the modular reduction is not a polynomial and must be approximated by a high-degree polynomial.
In 2025, CKKS bootstrapping on a single CPU core takes 2–10 seconds depending on parameters. On an H100 GPU, this drops to ~200 ms. Purpose-built FHE hardware accelerators target sub-millisecond bootstrapping.
CKKS for Approximate Arithmetic — Encoding and Rescaling
CKKS encodes a vector of complex numbers into a single Ring-LWE ciphertext, allowing all elements to be processed in parallel (SIMD style). Here is the encoding in detail.
Canonical embedding. The ring $R = \mathbb{Z}[x]/(x^n + 1)$ with $n$ a power of 2 has $n$ complex roots of unity: $x^n + 1$ vanishes exactly at the odd powers of a primitive $2n$-th root of unity. The canonical embedding maps a polynomial $m(x) \in R$ to its evaluations at these roots:
$\sigma(m) = \big(m(\zeta_0),\; m(\zeta_1),\; \dots,\; m(\zeta_{n-1})\big)$
where $\zeta_j = e^{i\pi(2j+1)/n}$. Because the polynomial has real coefficients, evaluations come in conjugate pairs, so effectively $n/2$ independent complex values (or $n/2$ real values) can be encoded per ciphertext.
Encoding procedure. Given a vector of real numbers $z = (z_1, \dots, z_{n/2})$:
- Apply the inverse canonical embedding $\sigma^{-1}$ to get a polynomial $m(x)$.
- Scale: $p(x) = \lfloor \Delta \cdot m(x) \rceil$, where $\Delta$ (typically $2^{40}$ to $2^{60}$) is the scaling factor. This converts real-valued coefficients to large integers.
- Encrypt $p(x)$ as a Ring-LWE ciphertext with modulus $q_L$.
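The encode/decode round trip above (without the encryption step) can be sketched for a tiny ring dimension. This assumes numpy and solves the Vandermonde system directly for clarity; real libraries use FFT-style algorithms and much larger $n$:

```python
import numpy as np

# CKKS-style encoding via the canonical embedding, for tiny n = 8 (sketch).
n, DELTA = 8, 2**30
# Roots of x^n + 1: odd powers of a primitive 2n-th root of unity.
roots = np.exp(1j * np.pi * (2 * np.arange(n) + 1) / n)

def encode(z):
    """Map n/2 complex values to an integer-coefficient polynomial."""
    full = np.concatenate([z, np.conj(z[::-1])])  # conjugate-pair extension
    V = np.vander(roots, n, increasing=True)      # V[j, k] = roots[j] ** k
    coeffs = np.linalg.solve(V, full)             # inverse canonical embedding
    return np.round(DELTA * coeffs.real).astype(np.int64)

def decode(p):
    """Evaluate the polynomial at the first n/2 roots and rescale."""
    return np.polyval(p[::-1], roots[: n // 2]) / DELTA

z = np.array([1 + 2j, 3 - 1j, 0.5 + 0j, -2 + 0j])
print(np.allclose(decode(encode(z)), z, atol=1e-6))  # True
```

The conjugate-pair extension is what guarantees the solved coefficients are (numerically) real, so the scale-and-round step produces a genuine integer polynomial.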
Rescaling after multiplication. When two CKKS ciphertexts with scaling factor $\Delta$ are multiplied, the result has scaling factor $\Delta^2$: the bit-length of the integer representation doubles. To keep the scaling factor stable, the rescaling operation divides the ciphertext by $\Delta$ (with rounding) and drops one level of the modulus chain:
$c' = \left\lfloor \frac{c}{\Delta} \right\rceil \pmod{q_{\ell-1}}$
The rescaling introduces a small rounding error (bounded by a small constant in coefficient norm) which manifests as a loss of precision in the decoded real numbers. With $\Delta = 2^{40}$, you retain about 12 decimal digits of precision after each multiplication, comparable to 64-bit double-precision arithmetic.
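The rescaling arithmetic can be reproduced with exact integers standing in for ciphertext coefficients — a hypothetical fixed-point demo, with no encryption involved:

```python
# Rescaling in isolation: multiply two scale-DELTA integers, then divide one
# DELTA back out with rounding, all in exact integer arithmetic.
DELTA = 2**40
x, y = 3.141592653589793, 2.718281828459045

px, py = round(x * DELTA), round(y * DELTA)   # scale-DELTA encodings

product = px * py                             # scale is now DELTA**2
rescaled = (product + DELTA // 2) // DELTA    # rescale: back to scale DELTA

decoded = rescaled / DELTA
print(abs(decoded - x * y) < 1e-11)           # True: ~12 digits survive
```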
The modulus chain. The ciphertext modulus starts at $q_L$ for a depth-$L$ computation. Each multiplication consumes one level ($q_\ell \to q_{\ell-1}$). When the last level $q_0$ is reached, the computation is complete (or bootstrapping must be invoked to reset the modulus). Choosing $L$ large enough for the computation and the ring dimension $n$ large enough to maintain 128-bit security at modulus $q_L$ is the core parameter selection problem for CKKS.
FHE Libraries Comparison
Four libraries dominate practical FHE development. Choosing the right one depends on your language ecosystem, which scheme you need, and whether you prioritize flexibility or performance.
OpenFHE — https://github.com/openfheorg/openfhe-development
Language: C++ with Python bindings (pyOpenFHE). Schemes: BFV, BGV, CKKS, TFHE, FHEW, and their variants. The most complete open-source FHE library. Actively maintained by Duality Technologies, Samsung Research, and a community of academic contributors. NIST-standard parameter sets built in. Excellent documentation and tutorials. Best choice if you need flexibility across multiple schemes or are building research prototypes.
Microsoft SEAL — https://github.com/microsoft/SEAL
Language: C++ with Python (SEAL-Python) and .NET wrappers. Schemes: BFV and CKKS. Well-documented, stable, widely used in industry. Used internally in Microsoft Azure Confidential Computing. Does not include TFHE or bootstrapping for CKKS (as of version 4.x). Best choice for production BFV/CKKS applications in C++ or .NET environments.
TFHE-rs — https://github.com/zama-ai/tfhe-rs
Language: Rust. Scheme: TFHE with programmable bootstrapping (PBS). Developed by Zama. Extremely fast Boolean and integer operations — encrypts 8/16/32/64-bit integers and supports arithmetic directly. Powers the Zama Concrete compiler (which compiles Rust/Python functions to run on encrypted data automatically) and the fhEVM for confidential Ethereum smart contracts. Best choice for arbitrary program evaluation and smart contract applications.
Zama Concrete — https://github.com/zama-ai/concrete
Language: Python (compiler frontend), Rust (backend). Concrete is a compiler built on top of TFHE-rs. You write a Python function that operates on encrypted integers; Concrete compiles it to an FHE circuit and executes it. Removes the need to understand FHE internals. Best choice for ML engineers and data scientists who want to add privacy guarantees to existing models without learning FHE mathematics.