Sigma Protocols
Ivan Damgard’s lecture notes take the Schnorr protocol – which you might already know as the simplest “I know a secret” proof – and show that it’s not a one-off trick. It’s an instance of a general framework called sigma-protocols. And these protocols can be composed, transformed, and extended in ways that solve real problems in authentication, digital signatures, and privacy-preserving systems. The notes read like a toolkit manual: here are the parts, here’s what each one does, and here’s what you can build when you snap them together.
Let’s walk through the toolkit.
The Formal Definition: What Makes a Sigma-Protocol
Every sigma-protocol is a three-move conversation between a prover P (the student) and a verifier V (the examiner). The student claims to know a witness w for some public statement x – think of it as claiming you know a secret spell that corresponds to a publicly posted result.
The three moves are always the same shape:
- Commitment (a): The student performs a preliminary magical gesture – commits to something before knowing what the question will be.
- Challenge (e): The examiner asks a random question.
- Response (z): The student answers.
For this to qualify as a sigma-protocol for a relation R, three properties must hold:
Completeness. If the student really knows the spell and both sides follow the rules, the examiner always says “pass.” No honest wizard fails a fair exam.
Special Soundness. This is the cheat-detection mechanism – we’ll spend real time on it below.
Honest-Verifier Zero-Knowledge (HVZK). There exists a simulator – a machine that, given x and a challenge e, can produce a fake transcript (a, e, z) that looks exactly like a real conversation. The simulator doesn’t know w. It’s as if someone could forge a video of the exam that’s indistinguishable from the real thing.
One important caveat Damgard is upfront about: HVZK only guarantees privacy when the verifier plays fair and picks e randomly. Against a malicious verifier who picks e adversarially, the Schnorr protocol is not known to be zero-knowledge. But HVZK protocols are still enormously useful as building blocks – you can compose them into things that are secure against active cheating, as we’ll see with the OR-construction.
Special Soundness: “Catch Me Twice, You’ve Got Me”
Here’s the analogy. The headmaster makes you perform your spell in front of two enchanted mirrors. Each mirror asks a different question. If you can answer both from the same starting position – the same opening gesture, the same commitment – then the headmaster can reverse-engineer your spell by comparing your two answers. That sounds terrifying, but here’s the thing: in a real exam, you only face one mirror. You perform once, get one question, give one answer, and leave. Your secret stays safe.
So why does this help? Because it catches fakers. A student who doesn’t actually know the spell can rehearse a convincing answer for one specific question. But they can’t handle two different questions from the same starting position. Two answers from the same commitment expose the fraud.
Now the math. Suppose a prover produces two accepting transcripts for the same commitment a:
(a, e, z) where g^z = a * h^e mod p
(a, e', z') where g^z' = a * h^e' mod p
Divide one verification equation by the other:
g^(z - z') = h^(e - e') mod p
Since e != e’ and everything lives in a group of prime order q, the value (e - e’) has a multiplicative inverse mod q. So:
w = (z - z') * (e - e')^(-1) mod q
The secret is extracted. Done. Any prover who can answer two different challenges for the same commitment must know w. A cheater who doesn’t know w can answer at most one challenge correctly per commitment, giving a cheating probability of 1/2^t where t is the challenge bit-length.
This is the extraction argument that makes soundness concrete. It’s what separates sigma-protocols from protocols that merely “seem hard to cheat” – here, cheating is provably equivalent to solving the underlying hard problem.
The Knowledge Extractor: “Rewinding the VHS Tape”
Special soundness tells us that two transcripts with the same commitment break the secret open. But in a real protocol execution, the prover only answers one challenge. How do you get two transcripts?
Think of it this way. You’re watching a student perform their exam through a magic window. They commit, get challenged, respond, and pass. Now you hit rewind on the VHS tape. The student forgets everything that happened after their commitment – they’re back in the same internal state, same randomness, same opening gesture. You ask a different question. If they pass again, you now have two accepting transcripts from the same commitment, and special soundness hands you the secret.
A real wizard doesn’t care about being rewound. They know the spell. They’ll pass no matter what you ask. A faker, though, prepared for one specific question. Rewind and ask something different, and they’re caught.
This is how the formal proof of soundness works – literally rewinding the prover’s computation. Given a prover P* that succeeds with probability epsilon:
- Run P*, get a successful transcript (a, e, z)
- Rewind P* to the same internal state (same randomness)
- Send a different challenge e’
- If P* succeeds again, extract w via special soundness
Once the extractor has two transcripts (a, e, z) and (a, e’, z’), the math is identical to special soundness – it’s the same formula:
w = (z - z') * (e - e')^(-1) mod q
The extractor doesn’t have its own separate algebra. Its contribution is the strategy for obtaining those two transcripts. Special soundness is the lock-picking formula; the extractor is the scheme for getting your hands on two keys to compare.
The interesting math is in the efficiency analysis – how long until you find two hits from the same commitment? Damgard models this as a matrix problem. Build a matrix H where rows are the prover’s random choices (rho) and columns are possible challenges (e). Set H[rho, e] = 1 if the verifier accepts, 0 otherwise. The fraction of 1s in the whole matrix equals epsilon (the prover’s overall success probability).
Call a row “heavy” if at least epsilon/2 of its entries are 1. By a counting argument, more than half of all 1s live in heavy rows. So when the extractor’s first probe lands on a 1, there’s probability > 1/2 that it’s in a heavy row. In a heavy row, the expected number of probes to find a second 1 is:
T = 2^t / (epsilon/2 * 2^t - 1) <= 4/epsilon
With probability > 1/2 the first hit is in a heavy row, and then finding a second hit takes O(1/epsilon) probes. The whole procedure – including retries when the first hit lands in a non-heavy row – runs in expected time O(1/epsilon).
OR-Proofs: “I’m One of Two Suspects”
This is one of the most elegant constructions in the notes, and it unlocks a surprising amount of power.
The setup: Alice and Bob are both accused of pulling a prank at magic school. Alice actually did it – she knows the spell that caused the explosion in the potions lab. Bob is innocent. Alice wants to prove to the headmaster that one of them did it, without revealing which one.
Here’s what she does. For her own side, she runs the real sigma-protocol honestly – she knows the witness, she can answer any challenge. For Bob’s side, she uses the HVZK simulator to generate a fake transcript. Remember, the simulator can produce transcripts that are indistinguishable from real ones without knowing the witness. She sends both commitments to the headmaster.
The headmaster sends a random string s. Alice splits it as e_0 XOR e_1 = s – she already chose the simulated challenge for Bob’s fake side, so she derives the real challenge for her own side from the constraint. She responds honestly on her side, attaches the simulated response on Bob’s side, and sends everything back.
The headmaster checks both conversations and verifies that e_0 XOR e_1 = s. Everything checks out. But he can’t tell which side was real.
This protocol is witness indistinguishable (WI): even a cheating verifier who picks s adversarially cannot determine which of the two statements the prover actually knows a witness for. WI is weaker than full zero-knowledge – the verifier might learn that something happened – but they can’t tell which secret was used. And crucially, the OR-construction achieves WI starting from a protocol that was only HVZK. That’s a real upgrade.
Real-world connection: Ring signatures, as used in Monero, are essentially this idea scaled up. A transaction is signed by “one of these 10 keys” – but which one? Nobody knows. The OR-proof structure is what makes that possible.
Witness Indistinguishability: “The Identical Twins Defense”
This concept deserves its own moment because it shows up everywhere once you start looking.
Imagine identical twins at magic school. Both are wizards. Both know the same category of spell. One of them walks into the exam room and performs. You’re watching from outside. You see a perfect performance, a passed exam. But you cannot tell which twin it was.
That’s witness indistinguishability. If a relation has multiple valid witnesses – multiple secrets that all satisfy the public statement – then a WI protocol guarantees that no verifier, no matter how malicious, can determine which witness the prover used.
It’s weaker than full zero-knowledge. In ZK, the verifier learns nothing beyond the truth of the statement. In WI, the verifier might learn that the statement is true (which, admittedly, they were supposed to learn), but they can’t distinguish between possible explanations. For many applications, this is exactly enough. You don’t need to hide that a transaction happened – you need to hide who made it.
Identification Schemes and the Mafia Attack: “The Earpiece Scam”
Damgard constructs a full identification scheme from sigma-protocols (Section 7). Users U_1, …, U_n each get a key pair (x_i, w_i) from a generator G. To identify yourself, you run the sigma-protocol as prover. Simple.
The security argument: if an adversary A can impersonate user U_i after watching them authenticate, then A must be convincing enough that the knowledge extractor can pull w_i out of A. But that contradicts the assumption that the underlying problem is hard.
But here’s where it gets interesting – and where Damgard is honest about the limits.
The Mafia Attack. You’re at a restaurant. You hand your card to a waiter (corrupt verifier). The waiter has an earpiece connected to an accomplice at an electronics store. The accomplice is at the payment terminal, pretending to be you. Every message the real terminal sends, the accomplice relays through the earpiece to the waiter, who passes it to your card. Your card responds. The response goes back through the earpiece to the accomplice. The terminal is satisfied. You just bought someone a TV.
This is a relay attack, and the basic sigma-protocol can’t prevent it because the prover has no way to tell which verifier they’re talking to. The messages just pass through.
The fix: Give verifiers public keys too, and have the prover use the OR-construction to prove “I know w_i OR I know the verifier’s key w_j.” Now the proof is targeted. Only the intended verifier finds it convincing, because anyone else could have faked the whole thing using their own key. The corrupt waiter’s relay becomes useless – the proof was crafted for the real terminal, not the electronics store terminal.
Deniability: “Off the Record”
Here’s an observation that’s easy to miss but has deep implications for protocol design.
You could use digital signatures for identification. Sign the challenge, verify the signature. It works, and it only takes two messages instead of three. So why bother with sigma-protocols?
Think of it this way. A signature is a notarized letter. If a bouncer at a bar asks you to prove you’re over 21, and you hand him a notarized letter from the government, he can photocopy it and show it to anyone. “Look, this person was at my bar on Friday night.” You can’t deny it – the notarization is independently verifiable.
A sigma-protocol transcript is more like a conversation that happened at the door. The bouncer challenged you, you responded, he was satisfied and let you in. But if he tries to tell someone about it later, they shrug. “You could have made up that whole conversation. There’s a simulator that produces transcripts just like that.” The bouncer has nothing to show. The conversation was deniable.
This is a real design consideration. Sigma-protocols give you deniability that signatures don’t. If you care about the prover’s long-term privacy – if you don’t want authentication events to leave permanent, transferable evidence – you want the three-move interactive protocol, not the two-move signature-based shortcut.
Commitment Schemes: “The Sealed Envelope”
Sigma-protocols can be turned inside out to build commitment schemes (Section 9). The analogy is the oldest one in the book, but it’s the right one.
You and a friend are betting on a coin flip. You want to commit to your guess before the flip, but you don’t want your friend to see your guess until after. So you write “heads” on a slip of paper, seal it in an envelope, and hand it over. Your friend flips the coin. You open the envelope.
The two security properties:
- Hiding: Your friend can’t see through the envelope. Before it’s opened, the commitment reveals nothing about what’s inside.
- Binding: You can’t switch the paper after sealing. Once committed, you’re stuck with what you wrote.
In sigma-protocol terms:
- Commit: Run the HVZK simulator on input (x, e) to get (a, e, z). Send a as the commitment.
- Open: Reveal e and z. The verifier checks that (a, e, z) is an accepting conversation.
This gives perfectly hiding commitments (the commitment a reveals nothing about e, since the simulator’s output distribution is independent of e) and computationally binding commitments (opening to two different values would require two accepting conversations with the same a but different challenges – which by special soundness would let you extract w, contradicting the hardness of R).
The Fiat-Shamir Transform: “Replacing the Teacher with an Exam Sheet”
Everything so far has been interactive. The student and examiner go back and forth. But what if you want to take the exam alone and mail in your answer sheet?
That’s the Fiat-Shamir heuristic (Section 10). Instead of a live teacher picking a random question, you hash your commitment to generate the question:
e = H(a)
You compute your commitment a, hash it to get the challenge e, compute the response z, and send (a, z) to anyone who wants to verify. They recompute e = H(a) and check the verification equation.
The key insight: you can’t choose your commitment to get an easy question, because changing a changes H(a) unpredictably. It’s as if the exam sheet was printed by a machine that reads your opening gesture and spits out a question you couldn’t have anticipated. You have to commit before you know what you’ll be asked, just like in the interactive version.
In the random oracle model – where H is modeled as a truly random function – this is provably secure:
- The prover can’t predict H(a) before committing to a, so they have no advantage over facing a live verifier
- The challenge is always “honestly” chosen (by the oracle), so HVZK automatically upgrades to full zero-knowledge
- A cheating prover would need to query the oracle exponentially many times to find a favorable challenge
This is exactly how Schnorr signatures work. The signature on a message m is computed with e = H(a, m), and the signature is the pair (e, z) – since a can be recomputed from e, z, and the public key. The interactive proof of knowledge becomes a non-interactive signature scheme. Three moves collapse to zero interaction.
The caveat, and Damgard doesn’t hide it: nobody can prove that any concrete hash function (SHA-256, SHAKE, whatever) behaves like a random oracle. The random oracle is an idealization. But a security proof in the random oracle model rules out all known attack strategies, and practically speaking, that’s the best assurance we have for deployed systems.
The Sigma-Protocol Toolkit
Damgard’s notes are, at their core, a modular toolkit. Each piece is simple. The power is in composition.
| Component | What it gives you |
|---|---|
| Basic sigma-protocol (Schnorr) | Proof of knowledge for discrete log |
| Special soundness | Extractability – the prover must know the witness |
| Knowledge extractor (rewinding) | Formal proof that soundness holds against any prover |
| HVZK simulator | Privacy against honest verifiers; building block for everything else |
| OR-construction | Prove one-of-two without revealing which; upgrades HVZK to witness indistinguishability |
| Witness indistinguishability | Can’t tell which witness was used – enables ring signatures, anonymous credentials |
| Commitment schemes | Hiding + binding from any sigma-protocol |
| Fiat-Shamir / random oracle | Non-interactive proofs and digital signatures |
| Identification with OR | Targeted authentication, resistant to relay/mafia attacks |
| Deniability | Interactive proofs leave no transferable evidence, unlike signatures |
Start with a three-move protocol for a simple relation. Add special soundness and you get cheat-detection. Add a simulator and you get privacy. Compose with OR and you get anonymity. Apply Fiat-Shamir and you get signatures. Bolt on targeted verification and you get relay-attack resistance. Every piece snaps together because they all share the same (a, e, z) skeleton.
That’s the real lesson of these notes. Sigma-protocols aren’t just a proof technique. They’re a design language for building cryptographic systems, and once you internalize the three-move pattern, you start seeing it everywhere.
References
- Damgård, Ivan — “On Σ-Protocols” (CPT 2010, v.2)