Vitalik Buterin: How Does zk-SNARKs Technology Protect Privacy?

IntermediateDec 03, 2023
This article delves into the workings of zk-SNARK technology, its applicability in current applications, and elaborates on the challenges and potential capabilities of this technology in real-world scenarios.
Vitalik Buterin: How Does zk-SNARKs Technology Protect Privacy?

zk-SNARKs are a powerful cryptographic tool that has become an increasingly essential part of both blockchain and non-blockchain-based applications. Their complexity is evident both in understanding how they function and in how they can be effectively utilized. This article delves into how zk-SNARKs adapt to current applications, provides examples of what they can and cannot achieve, and offers general guidelines on when zk-SNARKs are suitable for specific applications. A particular emphasis will be on their role in ensuring privacy.

What are zk-SNARKs?

Imagine having a public input x, a private input w, and a (public) function f(x,w) → {True,False}, which validates the inputs. With zk-SNARKs, one can prove that they know a w, such that for a given f and x, f(x,w) = True, all without revealing what w actually is. Moreover, verifiers can authenticate the proof much faster than they could compute f(x,w) even if they knew w.

This endows zk-SNARKs with two attributes: privacy and scalability. As mentioned, our examples in this article will mainly focus on the privacy aspect.

Proof of Membership

Suppose you have an Ethereum wallet and you want to prove that this wallet is registered under a proof-of-humanity system, without disclosing who the registered individual actually is. This function can be mathematically described as:

Private input (w): Your address A, your address private key k

Public input (x): Address set of all verified proof-of-humanity profiles {H1…Hn}

Verification function f(x,w):

Interpret w as a pair (A, σ) and x as a valid profile list {H1…Hn}

Verify A is one of the addresses in {H1…Hn}

Confirm privtoaddr(k) = A

If both verifications are passed, return True. If any one of them fails, return False.

The prover generates their address A and the associated key k, providing w=(A,k) as the private input for f. They obtain the public input, which is the current set of verified proof-of-humanity profiles {H1…Hn}, from the chain. They then run the zk-SNARK proof algorithm, which (assuming the input is correct) produces a proof. This proof is sent to the verifier, along with the block height where they fetched the verified profile list.

The verifier also reads the chain, acquiring the list {H1…Hn} from the specified height by the prover, and checks the proof. If the verification is successful, the verifier believes the prover possesses a verified proof-of-humanity profile.

Before delving into more intricate examples, I strongly recommend fully understanding the above example.

Making Membership Proofs More Efficient

One drawback of the aforementioned proof system is that the verifier needs to be aware of the entire profile set {H1…Hn}, which requires O(n) time to “input” this set into the zk-SNARK mechanism. This issue can be addressed by using the on-chain Merkle root, which encompasses all profiles, as the public input (potentially just the state root). We add another private input, a Merkle proof M, confirming that the prover’s account A is in the pertinent part of the tree.

A highly recent and more efficient alternative to Merkle proofs for ZK membership proofs is Caulk. In the future, some of these use cases might transition to solutions similar to Caulk.

ZK-SNARKs and Coins

Projects like Zcash and Tornado.cash enable us to possess privacy-protecting currencies. Now, one might think they could utilize the mentioned “ZK human proofs”, but it’s not about proving access to human profile proofs; it’s about proving access to coins. Here lies a problem: we must address both privacy and double-spending simultaneously. That is, we shouldn’t spend the same coin twice.

Here’s how we solve it: Anyone who possesses a coin has a private secret “s”. They compute a “leaf” L=hash(s,1) locally, which gets published on the chain, becoming a part of the state, N=hash(s,2), which we call a nullifier. The state is stored in a Merkle tree.

To spend a coin, the sender must produce a ZK-SNARK where:

  • Public inputs include a nullifier N, the current or recent Merkle root R, and a new leaf L’ (intended for the recipient who has a secret s’ passed to the sender as L’=hash(s’,1)).

  • Private inputs comprise a secret s, a leaf L, and a Merkle branch M.

The verification function checks:

  • M is a valid Merkle branch, proving L is a leaf of the tree rooted at R, where R is the Merkle root of the current state.

  • hash(s,1)=L and hash(s,2)=N.

The transaction contains the nullifier N and the new leaf L’. We don’t actually prove anything about L’, but it’s “mixed” into the proof to prevent modification by third parties during the transaction. To validate the transaction, the chain verifies the ZK-SNARK and checks if N has been used in prior transactions. If successful, N is added to the set of spent nullifiers, rendering it unusable again. L’ is added to the Merkle tree.

Using ZK-SNARKs, we link two values: L (appearing on-chain when the coin is minted) and N (appearing when spent), without revealing which L connects to which N. The connection between L and N is only discernible when you know the secret s that generates them. Each minted coin can be used once (as there’s only one valid N for each L), but the specific coin used at any time remains concealed.

This is a crucial primitive to grasp. Many mechanisms we describe below are based on this, though with varied purposes.

Coins with Arbitrary Balances

The above can be easily extended to coins with arbitrary balances. We maintain the concept of a “coin”, but each coin carries a (private) balance. A straightforward way to achieve this is for each coin to have chain storage, not just with leaf L, but also with an encrypted balance. Each transaction would consume two coins and create two new ones, adding two pairs (leaf, encrypted balance) to the state. The ZK-SNARK also verifies that the sum of input balances equals the sum of output balances, and both output balances are non-negative.

ZK Anti-Denial-of-Service

An intriguing anti-DOS tool: Imagine you have some on-chain identity that isn’t easily created; it could be a human-proof profile or 32 ETH validators, or just an account with a non-zero ETH balance. We could create a more DOS-resistant peer-to-peer network by only accepting messages that prove the sender has a profile. Each profile would be allowed up to 1000 messages per hour. If a sender cheats, their profile gets removed from the list. But how do we ensure privacy?

First, the setup: let k be the user’s private key, and A=privtoaddr(k) the corresponding address. A list of valid addresses is public (e.g., an on-chain registry). So far, this mirrors the human-proof example: you must prove you hold the private key to an address without revealing which one. But we don’t just want proof you’re on the list. We need a protocol that lets you prove you’re on the list but limits your proofs.

We’ll divide time into periods: each lasting 3.6 seconds (making 1000 periods per hour). Our goal is to let each user send only one message per period; if they send two in the same period, they’re caught. To allow for occasional bursts, they can use recent periods. So, if a user has 500 unused periods, they can send 500 messages all at once.

Protocol

Let’s start with a basic version using nullifiers. A user generates a nullifier with (N = \text{hash}(k, e)) where (k) is their key, and (e) is an epoch number, then publishes it with message (m). ZK-SNARK then obfuscates the (\text{hash}(m)). Nothing about (m) is verified in this process, thus tying the proof to a single message. If a user binds two proofs to two distinct messages using the same nullifier, they risk getting caught.

Now, we shift to a more intricate version. In this scenario, the subsequent protocol reveals their private key rather than merely confirming if someone has used the same epoch twice. Our pivotal technique hinges on the principle that “two points determine a line.” Revealing one point on a line discloses little, but exposing two points unveils the entire line.

For every epoch (e), we choose a line (L_e(x) = \text{hash}(k, e) \times x + k). The slope of the line is (\text{hash}(k, e)), and the y-intercept is (k), both unknown to the public. To produce a certificate for message (m), the sender provides (y = L_e(\text{hash}(m)) = \text{hash}(k, e) \times \text{hash}(m) + k), along with a ZK-SNARK proof that the calculation of (y) is accurate.

Summarizing, ZK-SNARK is as follows:

Public Inputs:

  • ({A_1…A_n}): A list of valid accounts

  • (M): The message being validated by the certificate

  • (E): The epoch number for the certificate

  • (Y): Evaluation of the line function

Private Input:

  • (K): Your private key

Verification Function:

  • Check if (\text{privtoaddr}(k)) exists in ({A_1…A_n})

  • Confirm (y = \text{hash}(k, e) \times \text{hash}(m) + k)

But what if someone uses an epoch twice? They would reveal two values (m_1) and (m_2) along with their certificate values (y_1 = \text{hash}(k, e) \times \text{hash}(m_1) + k) and (y_2 = \text{hash}(k, e) \times \text{hash}(m_2) + k). We can then utilize these two points to recover the line and subsequently the y-intercept, which is the private key.

So, if someone reuses an epoch, they inadvertently reveal their private key to everyone. Depending on the context, this might lead to funds theft or merely broadcasting the private key and integrating it into a smart contract, resulting in the associated address’s removal.

A viable off-chain anonymous anti-denial-of-service system, suitable for blockchain peer-to-peer networks, chat apps, and similar systems, demands no proof of work. The RLN project essentially focuses on this concept, although with minor alterations (namely, they utilize both nullifiers and the two-point line technique, making it simpler to detect instances where an epoch is reused).

ZK’s Negative Reputation

Imagine establishing 0chan, an online forum like 4chan that offers complete anonymity (you don’t even have permanent names), but with a reputation system to promote higher quality content. Such a system could have a governance DAO to flag posts that breach system rules, introducing a three-strike mechanism.

The reputation system can cater to positive or negative reputations; however, accommodating negative reputation demands additional infrastructure. This requires users to incorporate all reputation data into their proofs, even if it’s negative. We will primarily focus on this challenging use-case, akin to what Unirep Social aims to implement.

Linked Post: Basic Knowledge

Anyone can post by transmitting a message on the chain that contains the post, accompanied by a ZK-SNARK. This ZK-SNARK serves as a proof that (i) you possess a unique external identity which grants you permission to create an account, or (ii) you have previously published specific posts. Specifically, ZK-SNARK functions as follows:

Public Inputs:

  • Nullifier, N

  • Most recent blockchain state root, R

  • Post content (‘mixed’ into the proof to bind it to the post, without any computations on it)

Private Inputs:

  • Your private key, k

  • An external identity (address A) or the nullifier, Nprev, used in the previous post

  • A Merkle proof, M, that the chain contains A or Nprev

  • The ith post you’ve published using this account

Verification Function:

  1. Confirm that M is a valid Merkle branch, proving that (A or Nprev, whichever is provided) is a leaf of a tree with root R.

  2. Verify N = enc(i, k) where enc is an encryption function (e.g., AES).

  3. If i=0, verify A=privtoaddr(k), otherwise verify Nprev=enc(i−1,k).

Besides proof validation, the chain also checks (i) that R is actually a recent state root, and (ii) that the nullifier N has not been used before. Up to this point, it resembles previously described privacy-preserving coins, but we’ve added a process to ‘mint’ new accounts and removed the capability to ‘send’ your account to different keys. Instead, all nullifiers are generated using the original key. We employ enc here to make the nullifier reversible. If you have key k, you can decrypt any specific nullifier on-chain; if the result is a valid index rather than random gibberish (e.g., we can just check dec(N) < 2^64), you’d know the nullifier was generated using key k.

Adding Reputation:

In this scheme, reputation is on-chain and explicit. Some smart contracts have a method called addReputation, which takes the nullifier released with a post and the number of reputation units to be added or subtracted as input.

We’ve extended the data stored on-chain for each post. Instead of storing just the nullifier N, we store {N, h¯, u¯} where:

  • h¯ = hash(h, r) where h represents the block height of the state root referenced in the proof.

  • u¯ = hash(u, r) where u is the reputation score of the account (starting at 0 for new accounts).

R here is a random value added to prevent h and u from being found through brute force searching. In cryptographic terms, adding R makes the hash a concealed commitment.

Assume a post uses root R and stores {N, h¯, u¯}. Within its proof, it links to a previous post that stored the data {Nprev, h¯prev, u¯prev}. The post’s proof also has to traverse all reputation entries posted between hprev and h. For each nullifier N, the verification function decrypts N using user’s key k. If decryption outputs a valid index, it applies the reputation update. If the total of all reputation updates equals δ, then it proves u = uprev + δ.

If we wish to implement a “three strikes and you’re out” rule, ZK-SNARK would also ensure u > -3. If we want a rule where a post with rep ≥ 100 receives a special “high-reputation post” tag, that can be done too.

To enhance the scalability of this system, we can divide it into two types of messages: Posts and RCAs. A post would be off-chain, though it requires pointing to an RCA made in the past week. RCAs would be on-chain, traversing all reputation updates since the publisher’s previous RCA. This way, the on-chain load is reduced to one transaction per post per week, plus one transaction for each reputation message.

Holding Centralized Parties Accountable

Sometimes, there’s a need to design a system with a centralized “operator.” The reasons for this might vary: at times, it’s for scalability, and at others, it’s for privacy, especially the privacy of data held by the operator. For instance, the MACI coercive-resistant voting system requires voters to submit their votes on-chain, encrypted with a key held by a centralized operator. This operator decrypts all on-chain votes, tallies them, and displays the final outcome. They use ZK-SNARK to prove that everything they did was accurate. This added complexity is crucial to ensure robust privacy (known as coercive resistance): users can’t prove to anyone how they voted, even if they wanted to. Thanks to blockchain and ZK-SNARK, our trust level in the operator remains minimal. Malicious operators can breach coercive resistance, but since votes are posted on the blockchain, they can’t cheat by censoring votes. And since they must provide a ZK-SNARK, they can’t cheat by miscalculating results.

Combining ZK-SNARKs with MPC

A more advanced use of ZK-SNARKs is in computations where proof is required, with input distributed among two or more parties, and we wouldn’t want any party to learn about the others’ input. In a two-party scenario, garbled circuits can fulfill privacy requirements; for N parties, more intricate multi-party computation protocols can be used. ZK-SNARKs can be integrated with these protocols for verifiable multi-party computations. This enables advanced reputation systems, allowing multiple participants to execute joint calculations on their private inputs. The math to effectively achieve this is still in its early stages.

What can’t we make private?

ZK-SNARKs is highly effective in creating systems where users have private states. However, it can’t maintain a state as private that nobody knows. For information to be proven, the prover must know it in plaintext. Uniswap is an example that’s hard to privatize. In Uniswap, there’s a central logical “entity” – the liquidity provider account, which doesn’t belong to anyone, and all Uniswap transactions occur with this account. You can’t hide the state of this account since someone needs to hold this state in plaintext to prove it, and each transaction would require their active involvement. You could use ZK-SNARK’s garbled circuits to make a centralized, secure, private version of Uniswap, but it’s unclear if the benefits would outweigh the costs. It might not even offer any real benefits: contracts need to inform users of asset prices, and per-block price shifts can reveal transaction activity. Blockchains can globalize state information, and ZK-SNARKs can privatize it, but there’s no solid method to globalize and privatize state info simultaneously.

Putting the primitives together

In the sections above, we saw examples of tools that are powerful in their own right but can also serve as building blocks for other applications. Nullifiers, vital for currencies, now reappear in other use cases. The “coercive linking” technique used in the negative reputation section has a broad application. It’s highly effective for many apps where a user’s “profile” changes over time in intricate ways, and you want to compel users to follow system rules while preserving privacy. Users could even be tasked with representing their internal “state” using a full private Merkle tree. The “commitment pool” tool mentioned can be built with ZK-SNARK. If certain apps can’t function fully on-chain and require a centralized operator, the very same techniques can keep the operator honest. ZK-SNARK is a potent tool, blending accountability and privacy benefits. While they do have their limitations, in some cases, clever application designs can bypass these constraints. I hope to see more applications adopt ZK-SNARK and eventually build applications that meld ZK-SNARK with other forms of encryption in the coming years.

Disclaimer:

  1. This article is reprinted from[Foresightnews]. All copyrights belong to the original author [Jacob Ko]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.
Start Now
Sign up and get a
$100
Voucher!
Create Account