The Ingenious ZK Application: Tornado Cash

BeginnerFeb 28, 2024
This article introduces privacy projects represented by Tornado, which truly utilize the zero-knowledge property of the ZK-SNARK algorithm, while most projects under the ZK banner only use the succinctness of ZK-SNARK. Often, people confuse the difference between Validity Proof and ZK, and Tornado serves as an excellent example to understand the application of ZK.
The Ingenious ZK Application: Tornado Cash

Introduction: Recently, Vitalik and some scholars co-published a new paper, mentioning how Tornado Cash implements an anti-money laundering scheme (essentially allowing the withdrawer to prove that their deposit record belongs to a set that does not contain dirty money), but the paper lacks a detailed interpretation of the business logic and principles of Tornado Cash, leaving some readers puzzled.

It is worth mentioning that privacy projects represented by Tornado are the ones that truly utilize the zero-knowledge property of the ZK-SNARK algorithm, while most projects labeled with ZK only use the succinctness of ZK-SNARK. People often confuse the difference between Validity Proof and ZK, and Tornado serves as an excellent case to understand the application of ZK. The author of this article had written about the principles of Tornado in 2022 for Web3Caff Research, and today selects and expands on some sections from that article, organizing it into this piece to systematically understand Tornado Cash.

Original Article Link: https://research.web3caff.com/zh/archives/2663?ref=157

The principle of “Tornado”

Tornado Cash utilizes a coin mixing protocol based on zero-knowledge proofs, with its older version launched in 2019 and a new beta version released towards the end of 2021. The older version of Tornado achieved a high level of decentralization, with its on-chain contracts being open source and not controlled by any multisignature mechanism, and its frontend code also being open source and backed up on the IPFS network. Due to the simpler and more understandable structure of the old version of Tornado, this article focuses on explaining it. The main idea behind Tornado is to mix a large number of deposit and withdrawal actions together. After depositing tokens in Tornado, depositors present a ZK Proof to prove they have made a deposit and then withdraw to a new address, thereby severing the link between deposit and withdrawal addresses.


More specifically, Tornado acts like a glass box filled with coins deposited by many people. We can see who deposited the coins, but since the coins are highly homogenized, it’s difficult to trace back which coin was deposited by whom if someone unfamiliar withdraws a coin.


(Source: rareskills)This scenario is somewhat common; for instance, when we swap ETH in a Uniswap pool, we cannot know whose ETH we are getting because many have provided liquidity to Uniswap. However, the difference is that every time you swap tokens on Uniswap, you need to use other tokens as the equivalent cost, and you cannot “privately” transfer funds to someone else; whereas, with a mixer, you only need to show a deposit proof to withdraw. To make the deposit and withdrawal actions appear homogenous, every deposit into a Tornado pool and every withdrawal from it is kept consistent in amount. For example, if there are 100 depositors and 100 withdrawers in a pool, although visible, they appear unlinked, and the amount deposited and withdrawn by each is the same.


This can obscure the traceability of fund transfers, offering a natural convenience for anonymizing transactions. The key question is: how does a withdrawer prove they have made a deposit?

The address making the withdrawal is not linked to any deposit addresses, so how can their eligibility for withdrawal be determined? The most direct method seems to be for the withdrawer to disclose which deposit record is theirs, but this would directly reveal their identity. This is where zero-knowledge proofs come into play. By presenting a ZK Proof that they have a deposit record in the Tornado contract that has not yet been withdrawn, a withdrawer can successfully initiate a withdrawal. Zero-knowledge proofs inherently protect privacy, revealing only that the person has indeed made a deposit into the fund pool, without disclosing which depositor they correspond to.


To prove “I have made a deposit in the Tornado fund pool” can be translated to “My deposit record can be found in the Tornado contract.” If we use Cn to represent a deposit record, the problem becomes: given that Tornado’s set of deposit records is {C1, C2, …C100…}, the withdrawer, Bob, proves he used his key to generate some Cn in the deposit records without revealing which specific Cn it is. This involves the special property of Merkle Proof. All of Tornado’s deposit records are incorporated into a Merkle Tree constructed on-chain, with these records as its bottom-level leaf nodes.The total number of leaves is about 2^20 > 1 million, most of which are in a blank state (assigned an initial value). Whenever a new deposit occurs, the contract records its unique value, Commitment, into a leaf, and then updates the Merkle Tree’s root.


For example, if Bob’s deposit was the 10,000th in Tornado’s history, the characteristic value Cn associated with this deposit would be entered into the 10,000th leaf node of the Merkle Tree, i.e., C10000 = Cn. The contract then automatically calculates a new Root and updates it. To save on computational resources, the Tornado contract caches data from a batch of previously changed nodes, such as Fs1, Fs2, and Fs0 in the diagram below.


(Source: RareSkills)

Merkle Proof, by its nature, is concise and lightweight, leveraging the simplicity of tree data structures in search/tracing processes. To externally prove that a transaction TD exists in a Merkle Tree, one only needs to provide a Merkle Proof corresponding to the Root, which is quite straightforward. If the Merkle Tree is especially large, with 2^20 power bottom-level leaves (i.e., 1 million deposit records), a Merkle Proof would only need to include the values of 21 nodes, which is very short.


To prove that a transaction H3 is indeed contained within a Merkle Tree, one must demonstrate that using H3 and other parts of data on the Merkle Tree, the Root can be generated, and the data needed to generate the Root (including Td) constitutes the Merkle Proof. When Bob withdraws, he needs to prove that his certificate corresponds to a deposit hash Cn recorded on the Merkle Tree in Tornado’s ledger. In other words, he must prove two things: Cn exists within the on-chain Tornado’s fictional Merkle Tree, specifically by constructing a Merkle Proof containing Cn; Cn is associated with Bob’s deposit certificate.

Tornado’s Business Logic Explained

In the frontend code of the Tornado user interface, many functionalities are pre-implemented. When a depositor opens the Tornado Cash webpage and clicks the deposit button, the accompanying frontend code generates two random numbers, K and r, locally. It then calculates the value of Cn=Hash(K, r) and passes Cn (referred to as the commitment in the diagram below) to the Tornado contract, which inserts it into the Merkle Tree recorded by the latter. Essentially, K and r act as private keys. They are crucial, and the system prompts users to save them securely. K and r are needed again during withdrawal.


(The option of encryptedNote allows users to encrypt the credential K and r with a private key and store it on the blockchain to prevent forgetting) Importantly, all these operations occur off-chain, meaning the Tornado contract and external observers are unaware of K and r. If K and r are leaked, it’s akin to the theft of wallet private keys.


Upon receiving a deposit from a user and the submission of Cn=Hash(K, r), the Tornado contract records Cn at the bottom layer of the Merkle Tree as a new leaf node, also updating the Root value. Therefore, Cn is directly linked to the user’s deposit action, allowing outsiders to know which user corresponds to each Cn, who has deposited tokens into the mixer, and the deposit records Cn of each depositor.

During the withdrawal process, the withdrawer enters the credential/private key (the random numbers K and r generated during deposit) on the frontend webpage. The program in the Tornado Cash frontend code uses K and r, Cn=Hash(K, r), and the Merkle Proof corresponding to Cn as input parameters to generate a ZK Proof. This proves that Cn exists in the Merkle Tree as a deposit record, and K and r are the credentials corresponding to Cn. This step essentially proves: I know the key corresponding to a deposit record on the Merkle Tree. When the ZK Proof is submitted to the Tornado contract, these four parameters are concealed, protecting privacy. The generation of ZK Proof involves additional parameters, including the Merkle root recorded in the Tornado contract at the time of withdrawal, a custom recipient address A, and an identifier nf to prevent replay attacks. These three parameters are publicly posted on the blockchain, which does not compromise privacy.


A detail to note is the use of two random numbers, K and r, to generate Cn instead of a single random number, providing increased security against collisions. A represents the withdrawal recipient address, chosen by the withdrawer. The identifier nf, designed to prevent replay attacks, is calculated as nf=Hash(K), where K is one of the two random numbers used in the deposit step to generate Cn. This links nf directly to Cn, establishing a one-to-one correlation between each Cn and its corresponding nf. The purpose of preventing replay attacks is due to the mixer’s design feature, which keeps the association between withdrawal amounts and specific Merkle tree leaves (Cn) unknown, allowing for the potential abuse of repeated withdrawals until the fund pool is depleted.


The nf identifier functions similarly to the nonce associated with each Ethereum address, preventing transaction replays. When a withdrawal occurs, a check ensures the submitted nf has not been used before; if unused, the withdrawal is valid, and the nf is recorded. Any attempt to generate an nf not associated with any recorded deposit Cn will fail to produce a valid ZK Proof, rendering the withdrawal unsuccessful.

If someone randomly generates an unrecorded non-fungible (nf) contract, would it work? Of course not. When the withdrawer generates a Zero-Knowledge Proof (ZK Proof), they must ensure that nf equals the Hash (K), where the random number K is associated with the deposit record Cn. This means that nf is linked to a recorded deposit Cn. If someone fabricates an nf arbitrarily, this nf won’t match any deposit records, making it impossible to generate a valid ZK Proof. Consequently, the withdrawal process cannot be completed successfully, and the withdrawal operation will fail. Some might ask: Is it possible to proceed without nf? Since the withdrawer needs to submit a ZK proof during the withdrawal to prove their association with a certain Cn, why not just check if the corresponding ZK Proof has been submitted to the blockchain every time a withdrawal occurs? However, this approach is highly costly because the Tornado Cash contract does not permanently store past ZK Proofs due to significant storage space wastage. Comparing every new ZK Proof submitted to the blockchain with existing proofs is less efficient than setting a small identifier like nf and permanently storing it.

According to the withdrawal function’s code example, the required parameters and business logic are as follows: The user submits a ZK Proof and nf (NullifierHash) = Hash (K), specifies a recipient address for the withdrawal, and the ZK Proof conceals the values of Cn, K, and r, making it impossible for outsiders to determine the user’s identity. The recipient often uses a new, clean address, which does not reveal personal information.


However, there is a minor issue: when users withdraw to remain untraceable, they often initiate the withdrawal transaction from a newly created address. At this time, the new address lacks ETH to pay for gas fees. Therefore, when initiating a withdrawal, the withdrawal address must explicitly declare a relayer to pay the gas fee on its behalf. Afterward, the mixer contract deducts a portion from the user’s withdrawal to compensate the relayer.

In summary, Tornado Cash can obscure the connection between withdrawers and depositors. In situations with a large user base, it is like a criminal blending into a crowd in a busy area, making it difficult for the police to track. The withdrawal process involves the use of ZK-SNARKs, with the hidden witness part containing critical information about the withdrawer, which is a key aspect of the entire mixer. Currently, Tornado appears to be one of the most ingenious application-layer projects related to ZK.

Disclaimer:

  1. This article is reprinted from [Geek Web3], All copyrights belong to the original author [Faust, geek web3]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.
Start Now
Sign up and get a
$100
Voucher!
Create Account