Former Arbitrum Tech Ambassador: Arbitrum's Component Structure (Part 1)

Beginner | Feb 27, 2024
This article provides a technical interpretation of Arbitrum One by Luo Benben (罗奔奔), former technical ambassador of Arbitrum and former co-founder of Goplus Security, an automated smart contract auditing company.

Chinese-language articles and materials on Layer 2 lack professional interpretations of Arbitrum, and even of OP Rollup in general. This article attempts to fill that gap by explaining how Arbitrum operates. Because Arbitrum's structure is complex, the full text still exceeds 10,000 words even after heavy simplification, so it is divided into two parts. We recommend bookmarking and sharing it as a reference!

Rollup sequencer brief introduction

The principle of Rollup scaling can be summarized in two points:

Cost optimization: Most computing and storage tasks are moved off L1 and onto L2. An L2 is mostly a chain running on a single server, that is, the sequencer/operator.

The sequencer is, in a sense, close to a centralized server. In the “impossible trinity of blockchain”, “decentralization” is given up in exchange for TPS and cost advantages. Users can have L2 process their transaction instructions instead of Ethereum, at a much lower cost than transacting on Ethereum.

(Source: BNB Chain)

Security Guarantee: The transaction content and resulting state on Layer 2 are synchronized to the Ethereum Layer 1, where their validity is verified through contracts. Meanwhile, Ethereum retains the historical records of L2, so even if the sequencer permanently crashes, others can reconstruct the entire state of L2 from the records on Ethereum.

Fundamentally, the security of Rollup is based on Ethereum. If a sequencer does not know the private key of an account, it cannot initiate transactions on behalf of that account or tamper with the asset balance of that account (even if attempted, it would be quickly detected).

Although the sequencer, as the central hub of the system, is centralized in character, in mature Rollup solutions a centralized sequencer can only engage in soft malicious behaviors such as transaction censorship or deliberate downtime. Ideal Rollup solutions include measures to restrain such behaviors (such as forced withdrawals or ordering proofs as anti-censorship mechanisms).

(The Loopring protocol sets a forced withdrawal function in the contract source code on L1 for users to call)

To prevent malicious behavior by Rollup sequencers, there are two types of state verification methods: Fraud Proof and Validity Proof. Rollup solutions using Fraud Proof are called Optimistic Rollup (OPR), while those using Validity Proof are often referred to as ZK Rollup (Zero-knowledge Proof Rollup, ZKR), rather than Validity Rollup, due to historical baggage.

Arbitrum One is a typical OPR. Its contracts deployed on L1 do not actively validate the submitted data but optimistically assume that it is correct. If the submitted data contains errors, L2 validators will initiate challenges.

Therefore, OPR also implies a trust assumption: there is at least one honest L2 validator node at any given time. On the other hand, ZKR contracts proactively and cost-effectively verify the data submitted by the sequencer through cryptographic calculations.

(Optimistic Rollup operation method)

(ZK Rollup operation method)

This article provides an in-depth introduction to Arbitrum One, the leading Optimistic Rollup project, covering all aspects of the system. After careful reading, you will gain a profound understanding of Arbitrum and Optimistic Rollup (OPR).

Core Components and Workflow of Arbitrum:

Core Contracts:

The most important contracts in Arbitrum include SequencerInbox, DelayedInbox, L1 Gateways, L2 Gateways, Outbox, RollupCore, Bridge, etc. These will be detailed later.

Sequencer:

Receives user transactions, sequences them, calculates transaction results, and quickly (usually <1s) returns receipts to users. Users can typically see their transactions confirmed on L2 within seconds, providing a Web2-like experience.

Additionally, the sequencer immediately broadcasts the newly generated L2 blocks off-chain (outside Ethereum), where any Layer 2 node can receive them asynchronously. At this point, however, these L2 blocks have no finality and can be rolled back by the sequencer.

Every few minutes, the sequencer compresses the sequenced L2 transaction data, aggregates it into batches, and submits them to the SequencerInbox contract on Layer 1 to ensure data availability and the operation of the Rollup protocol. Generally, L2 data submitted to Layer 1 cannot be rolled back and can be considered final.
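The batching step can be illustrated with a minimal sketch. JSON and zlib are stand-ins chosen purely for the illustration (they are not the encoding Arbitrum uses), and the function names are hypothetical. The point is only that a batch is an ordered, compressed blob that anyone can decode from the bytes posted on L1.

```python
import json
import zlib

def build_batch(transactions):
    """Serialize sequenced transactions in order and compress them into one blob."""
    raw = json.dumps(transactions, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level=9)

def decode_batch(blob):
    """Any node can invert the process using only the bytes posted on L1."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Toy transactions; the field names are illustrative assumptions.
txs = [{"from": "alice", "to": "bob", "value": 5, "nonce": n} for n in range(200)]
batch = build_batch(txs)
raw_size = len(json.dumps(txs, separators=(",", ":")).encode("utf-8"))
print(f"raw={raw_size} bytes, compressed={len(batch)} bytes")
```

Because many user transactions share one compressed blob, the L1 publication cost is spread across all of them.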

From the above process, we can summarize that Layer 2 has its own network of nodes, but these nodes are few in number and generally lack the consensus protocols commonly used in public blockchains. Therefore, their security is poor, and they must rely on Ethereum to ensure the reliability of data publishing and the validity of state transitions.

Arbitrum Rollup Protocol:

Through a series of contracts, it defines the structure of the blocks on the Rollup chain (called RBlocks), how the chain is extended, how RBlocks are published, and how the challenge process works. Note that the Rollup chain mentioned here is not the Layer 2 ledger as commonly understood, but an abstract “chain-like data structure” that Arbitrum One sets up separately to implement the fraud proof mechanism.

An RBlock can contain the results of multiple L2 blocks, and its data entity is stored in the RollupCore series of contracts. If there is a problem with an RBlock, validators will challenge the party that submitted it.

Validators:

Validators in Arbitrum are actually a special subset of Layer 2 full nodes, currently with whitelist admission.


Validators create new RBlocks (Rollup blocks, also called assertions) based on batches of transactions submitted to the SequencerInbox contract by the sequencer, and monitor the current state of the Rollup chain to challenge incorrect data submitted by the sequencer.

Active validators need to stake assets on the Ethereum chain in advance, and are sometimes referred to as stakers. Although Layer 2 nodes that do not stake assets can monitor the operation of the Rollup and send alerts to users about anomalies, they cannot directly intervene in incorrect data submitted by the sequencer on the Ethereum chain.

Challenge:

The basic steps can be summarized as multi-round interactive subdivision followed by a single-step proof. In the subdivision phase, the two parties to the challenge take turns subdividing the disputed transaction data over multiple rounds until the dispute is narrowed down to a single problematic opcode instruction, which is then verified. Arbitrum developers consider this “multi-round subdivision, single-step proof” paradigm to be the most gas-efficient implementation of fraud proofs. All steps are controlled by smart contracts, and neither party can cheat.

Challenge Period:

Due to the optimistic nature of OP Rollup, the contract does not actively check an RBlock after it is submitted to the chain; instead, it leaves a period of time for validators to disprove it. This time window is the challenge period, which is one week on the Arbitrum One mainnet. After the challenge period ends, the RBlock is finally confirmed, and the messages corresponding to transactions from L2 to L1 (such as withdrawals executed through the official bridge) can be released.
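The timing rule can be sketched as follows. This is a simplified model, not the contract logic: it measures the window in seconds from a timestamp, which is an assumption made for readability, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD_SECONDS = 7 * 24 * 60 * 60  # one week, as stated above

@dataclass
class Assertion:
    state_root: str
    posted_at: int            # L1 timestamp at which the RBlock was posted
    challenged: bool = False  # set when a validator opens a challenge

def can_confirm(assertion, now):
    """An unchallenged RBlock becomes confirmable once the window has elapsed."""
    deadline = assertion.posted_at + CHALLENGE_PERIOD_SECONDS
    return (not assertion.challenged) and now >= deadline

def can_release_withdrawal(assertion, now):
    """L2-to-L1 messages are released only after the RBlock is confirmed."""
    return can_confirm(assertion, now)
```

This is why a withdrawal through the official bridge waits about a week: the L1 contract has no other way to know that the claimed state is correct.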

ArbOS, Geth, WAVM:

Arbitrum uses a virtual machine called AVM, which consists of Geth and ArbOS. Geth is the most commonly used client software for Ethereum, and Arbitrum has made lightweight modifications to it. ArbOS is responsible for all L2-related special functions, such as network resource management, generating L2 blocks, and working in coordination with EVM. We consider the combination of both as a Native AVM, which is the virtual machine used by Arbitrum. WAVM is the result of compiling AVM code into Wasm. In the Arbitrum challenge process, the final “single-step proof” verifies WAVM instructions.

Here, we can represent the relationships and workflows between the various components using the diagram below:

L2 Transaction Lifecycle

The processing flow of an L2 transaction is as follows:

  1. The user sends transaction instructions to the sequencer.
  2. The sequencer first verifies the data, including digital signatures, of the transactions to be processed, filters out invalid transactions, and then sequences and computes them.
  3. The sequencer sends the transaction receipt to the user (usually very quickly), but this is only “pre-processing” done by the sequencer off the Ethereum chain; the transaction is in a state of Soft Finality and is not yet reliable. However, users who trust the sequencer (most users) can optimistically assume that the transaction has been completed and will not be rolled back.
  4. The sequencer encapsulates the pre-processed transaction data into a Batch after highly compressing it.
  5. At regular intervals (affected by factors such as data volume and Ethereum congestion), the sequencer publishes the transaction Batch to the Sequencer Inbox contract on L1. At this point, it can be considered that the transaction has Hard Finality.
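The five steps above can be sketched as a small state machine. Everything here is a toy model with hypothetical names; signature checking is reduced to a boolean flag and the Sequencer Inbox contract to a Python list.

```python
from enum import Enum

class Finality(Enum):
    SOFT = "soft"  # receipt issued by the sequencer, data not yet on L1
    HARD = "hard"  # included in a batch published to L1

class ToySequencer:
    def __init__(self):
        self.status = {}      # tx id -> Finality
        self.unbatched = []   # sequenced but not yet posted to L1
        self.l1_batches = []  # stand-in for the Sequencer Inbox contract

    def submit(self, tx_id, signature_ok=True):
        """Steps 1-3: validate, sequence, and return a receipt."""
        if not signature_ok:
            return None  # invalid transactions are filtered out
        self.unbatched.append(tx_id)
        self.status[tx_id] = Finality.SOFT
        return {"tx": tx_id, "finality": Finality.SOFT}

    def post_batch(self):
        """Steps 4-5: bundle the sequenced transactions and publish them to L1."""
        if not self.unbatched:
            return
        batch, self.unbatched = self.unbatched, []
        self.l1_batches.append(batch)
        for tx_id in batch:
            self.status[tx_id] = Finality.HARD
```

A transaction moves from SOFT to HARD only when `post_batch` runs, which mirrors the gap between receiving a receipt and the batch landing on L1.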

Sequencer Inbox Contract

The contract receives batches of transactions submitted by the sequencer to ensure data availability. At a deeper level, the batch data in the Sequencer Inbox fully records the transaction input information of Layer2. Even if the sequencer permanently crashes, anyone can restore the current state of Layer2 from the batch records and take over from the failed or missing sequencer.

In a physical analogy, what we see as L2 is only the projection of the batches in the Sequencer Inbox, while the light source is the state transition function (STF). Because the STF, as the light source, does not change easily, the shape of the shadow is determined only by the batches acting as the object.

The Sequencer Inbox contract is also called the fast inbox. The sequencer submits pre-processed transactions to it, and only the sequencer can submit data to it. The corresponding slow inbox is the Delayed Inbox, whose function will be described later in the process.

Validators will continuously monitor the Sequencer Inbox contract. Every time the sequencer publishes a Batch to this contract, an on-chain event is triggered. Upon detecting this event, the validator will download the batch data, execute it locally, and then publish an RBlock to the Rollup protocol contract on the Ethereum chain.

The Arbitrum bridge contract has a parameter called the accumulator, which records information about the newly submitted L2 batch and the number of transactions received on the slow Inbox, among other things.


(The sequencer continuously submits batches to SequencerInbox)

(The specific information of the Batch, the data field corresponds to the Batch data. The size of this part of the data is very large, and the screenshot is not fully displayed.)

The SequencerInbox contract has two main functions:

addSequencerL2BatchFromOrigin(): The sequencer calls this function each time it submits Batch data to the Sequencer Inbox contract.

forceInclusion(): This function can be called by anyone and is used to implement censorship-resistant transactions. How it takes effect will be explained in detail later when we discuss the Delayed Inbox contract.

Both of the above functions call “bridge.enqueueSequencerMessage()” to update the accumulator parameter in the bridge contract.
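One common way to build such an accumulator is a running hash chain, where each new entry commits to everything before it. The sketch below assumes that shape and uses SHA-256 as a stand-in hash; the fields and hash function of the actual bridge contract are not shown in this article, so `ToyBridge` and its method are hypothetical.

```python
import hashlib

def next_accumulator(prev_acc: bytes, message: bytes) -> bytes:
    """Fold one message into the running hash: acc' = H(acc || H(message))."""
    return hashlib.sha256(prev_acc + hashlib.sha256(message).digest()).digest()

class ToyBridge:
    def __init__(self):
        self.accumulators = []  # one entry per enqueued sequencer message

    def enqueue_sequencer_message(self, batch_data: bytes) -> bytes:
        """Append a batch and return the updated accumulator value."""
        prev = self.accumulators[-1] if self.accumulators else bytes(32)
        acc = next_accumulator(prev, batch_data)
        self.accumulators.append(acc)
        return acc
```

With this construction, two parties who hold the same latest accumulator value necessarily agree on the entire ordered history of batches, which is what makes a single 32-byte value useful as a reference point.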

Gas pricing

Obviously, L2 transactions cannot be free, because that would invite DoS attacks. In addition, the L2 sequencer has its own operating costs, and submitting data to L1 incurs overhead. When a user initiates a transaction on the Layer 2 network, the gas fee consists of the following two parts:

The cost of data publication, incurred by occupying Layer1 resources: this comes mainly from the batches submitted by the sequencer (each batch contains transactions from many users) and is ultimately shared among the transaction initiators. The pricing algorithm for this data publication fee is dynamic. The sequencer adjusts the pricing based on recent profit and loss, batch size, and the current Ethereum gas price.

The cost incurred by users for occupying Layer2 resources: to keep the system stable, a maximum limit is set on the gas processed per second (currently 7 million on Arbitrum One). The guidance gas prices for both L1 and L2 are tracked and adjusted by ArbOS; the formula is not elaborated here.
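The two-part structure can be sketched as a simple sum. This is only an illustration of the shape of the fee, not ArbOS's pricing formula (which this article does not give); the function name, the per-byte pricing of the L1 component, and all numbers are assumptions.

```python
def l2_transaction_fee(l2_gas_used, l2_gas_price_wei, l1_data_bytes, l1_price_per_byte_wei):
    """Total fee = L2 execution component + L1 data publication component."""
    l2_component = l2_gas_used * l2_gas_price_wei
    l1_component = l1_data_bytes * l1_price_per_byte_wei
    return l2_component + l1_component

# Illustrative numbers only, not real network prices.
fee = l2_transaction_fee(
    l2_gas_used=50_000,
    l2_gas_price_wei=100_000_000,            # 0.1 gwei
    l1_data_bytes=120,                       # this transaction's share of the batch
    l1_price_per_byte_wei=400_000_000_000,   # hypothetical per-byte price
)
print(fee)
```

In a model like this, the L1 component can dominate for small transactions, which is why compressing batches matters so much for the final fee.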

Although the specific gas price calculation is relatively complicated, users do not need to be aware of these details; they can clearly see that Rollup transaction fees are much cheaper than on the Ethereum mainnet.

Optimistic fraud proof

Looking back at the previous text, it’s apparent that Layer2 (L2) is essentially just a projection of the transaction input batches submitted by the sequencer in the Sequencer Inbox:

Transaction Inputs -> State Transition Function (STF) -> State Outputs

The inputs are already determined, and the STF is immutable, so the output result is also determined. The fraud proof and Arbitrum Rollup protocol system publish the output state, represented by an RBlock (also known as an assertion), to Layer1 and provide optimistic proofs for it.
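The determinism argument can be made concrete with a toy STF. The transfer rule, the account names, and the hash-of-JSON "state root" are all simplifying assumptions; the real STF is ArbOS plus the Geth core, and real state roots are Merkle commitments.

```python
import hashlib
import json

def stf(state, tx):
    """Toy state transition function: apply one transfer, skipping invalid ones."""
    balances = dict(state)
    if balances.get(tx["from"], 0) >= tx["value"]:
        balances[tx["from"]] -= tx["value"]
        balances[tx["to"]] = balances.get(tx["to"], 0) + tx["value"]
    return balances

def replay(genesis, batches):
    """Rebuild the L2 state from nothing but the batches recorded on L1."""
    state = dict(genesis)
    for batch in batches:
        for tx in batch:
            state = stf(state, tx)
    return state

def state_root(state):
    """Stand-in commitment to a state: hash of its canonical JSON encoding."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

genesis = {"alice": 10, "bob": 0}
batches = [
    [{"from": "alice", "to": "bob", "value": 4}],
    [{"from": "bob", "to": "carol", "value": 1},
     {"from": "carol", "to": "alice", "value": 9}],  # invalid: carol only has 1
]
```

Any two honest nodes that replay the same batches arrive at the same state and therefore the same root, so an RBlock whose root differs from an honest replay must be wrong.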

On L1, there are both input data published by the sequencer and output states published by validators. Upon careful consideration, is it necessary to publish the state of Layer2 on-chain? Because the inputs have already fully determined the outputs, and the input data is publicly visible, submitting the output results or state seems redundant. However, this idea overlooks the fact that there needs to be a settlement of states between the L1 and L2 systems. This is especially necessary for withdrawal actions from L2 to L1, which require proof of state.

When building Rollup, the core idea is to offload most computations and storage to L2 to avoid the high costs of L1. This implies that L1 does not know the state of L2; it only assists the L2 sequencer in publishing the input data for all transactions but is not responsible for calculating the state of L2.

Withdrawal actions essentially involve unlocking the corresponding assets from the L1 contract based on the cross-chain messages provided by L2 and transferring them to the user’s L1 account or completing other tasks.

At this point, the Layer1 contract will inquire: what is your state on Layer2, and how do you prove that you truly own the assets you’re claiming to transfer? At this stage, users need to provide corresponding Merkle Proofs, etc.
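A Merkle proof lets a contract check that one item belongs to a committed set while seeing only a handful of hashes. The sketch below is a generic binary Merkle tree over SHA-256, not Arbitrum's Outbox layout; the leaf contents and function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair an odd last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))  # (hash, sibling_is_right)
        index //= 2
    return proof

def verify_proof(leaf, proof, root):
    """Recompute the root from the leaf and siblings; this is all L1 has to do."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root

withdrawals = [b"alice:1 ETH", b"bob:2 ETH", b"carol:3 ETH"]
levels = build_levels(withdrawals)
root = levels[-1][0]
```

The verifier needs only the root, the claimed leaf, and about log2(n) sibling hashes, which is why L1 can settle withdrawals without knowing the full L2 state.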

Therefore, if we build a Rollup without a withdrawal function, it is theoretically possible not to synchronize the state to L1, and there is no need for a state proof system such as fraud proof (although it may cause other problems). But in real applications, this is obviously not feasible.

In the so-called optimistic proof, the contract does not check whether the output state submitted to L1 is correct, but optimistically believes that everything is accurate. The optimistic proof system assumes that there is at least one honest validator at any time. If an incorrect state occurs, it will be challenged through a fraud proof.

The advantage of this design is that there is no need to actively verify every RBlock published to L1, which avoids wasting gas. In fact, for OPR it is unrealistic to verify every assertion, because each RBlock contains one or more L2 blocks, and every transaction would have to be re-executed on L1. That would be no different from executing the L2 transactions directly on L1, which defeats the purpose of Layer 2 scaling.

ZKR does not face this issue because ZK Proofs have succinctness, requiring only validation of a small proof without the need to actually execute the many transactions behind the proof. Therefore, ZKR does not operate optimistically; each state publication is accompanied by mathematical verification by a Verifier contract.

Although fraud proofs cannot achieve the same level of succinctness as zero-knowledge proofs, Arbitrum employs an interactive “multi-round subdivision, single-step proof” process in which ultimately only a single virtual machine opcode needs to be proven, resulting in relatively low costs.

Rollup protocol

Let’s first take a look at the entrance to initiate challenges and start proofs, that is, how the Rollup protocol works.

The core contract of the Rollup protocol is RollupProxy.sol. While keeping the data structure consistent, it uses a rare dual-proxy structure: one proxy corresponds to two implementations, RollupUserLogic.sol and RollupAdminLogic.sol, which block explorer (“Scan”) tools cannot yet parse well.

Moreover, the ChallengeManager.sol contract is responsible for managing challenges, and the OneStepProver series of contracts is used to adjudicate fraud proofs.

(Source: L2BEAT official website)

In the RollupProxy, a series of RBlocks (also known as assertions) submitted by different validators are recorded, represented by blocks in the diagram: green for confirmed, blue for unconfirmed, and yellow for disproved.

The RBlock contains the final state resulting from the execution of one or more L2 blocks since the previous RBlock. These RBlocks form a Rollup Chain in appearance (note the distinction from the L2 ledger itself). In an optimistic scenario, this Rollup Chain should not have forks, as forking implies validators submitting conflicting Rollup Blocks.

To propose or agree to an assertion, validators need to stake a certain amount of ETH, becoming a Staker. This ensures the economic basis for honest behavior among validators, as the loser’s stake will be forfeited in case of challenge/fraud proof.

The blue block numbered 111 in the bottom right corner of the diagram will ultimately be disproved because its parent block, block number 104, is incorrect (yellow).

Additionally, Validator A has proposed Rollup Block 106, which Validator B disagrees with and challenges.

After B initiates the challenge, the ChallengeManager contract is responsible for verifying the segmentation of the challenge steps:

  1. Segmentation is a process in which both parties take turns interacting. One party segments the historical data contained in a given Rollup Block, and the other party points out which data segment is problematic. The process resembles bisection (in practice, an N/K split) and gradually narrows the scope.

  2. Subsequently, it can be further pinpointed which transaction and its results are problematic, then further subdivided to the specific machine instruction within that transaction that is disputed.

  3. The ChallengeManager contract only verifies if the “data segment” generated after subdividing the original data is valid.

  4. When the challenger and the challenged party identify the machine instruction to be challenged, the challenger invokes oneStepProveExecution() to send a single-step fraud proof, demonstrating that the execution result of this machine instruction is flawed.
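The narrowing process above can be sketched as a bisection over two execution traces. This is a simplification under stated assumptions: each round splits the range in two (the text notes the real split is N/K), states are compared directly rather than by hash, and the toy "machine" just adds operands. All names are hypothetical.

```python
def run_trace(program, faulty_step=None):
    """Return the machine state after each step (index 0 is the start state).
    A dishonest party corrupts one step; all later states inherit the error."""
    states = [0]
    for i, operand in enumerate(program):
        nxt = states[-1] + operand
        if i == faulty_step:
            nxt += 1  # inject a wrong result at this step
        states.append(nxt)
    return states

def bisect_dispute(asserter_states, challenger_states):
    """Narrow the dispute to one step.
    Invariant: the parties agree on the state at `lo` and disagree at `hi`."""
    lo, hi = 0, len(asserter_states) - 1
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserter_states[mid] == challenger_states[mid]:
            lo = mid  # still in agreement: the fault lies later
        else:
            hi = mid  # already diverged: the fault lies earlier
        rounds += 1
    return lo, rounds  # the step from `lo` to `lo + 1` is the disputed instruction

program = list(range(1, 1025))  # 1024 toy instructions
honest = run_trace(program)
dishonest = run_trace(program, faulty_step=700)
step, rounds = bisect_dispute(dishonest, honest)
```

The number of rounds grows only logarithmically with the length of the execution, so even a very long computation is reduced to a single instruction that L1 can afford to re-execute.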

One-step proof

Single-step proof is the core of the entire Arbitrum fraud proof. Let’s take a look at what the single-step proof specifically proves.

This requires understanding WAVM first. WAVM (Wasm Arbitrum Virtual Machine) is a virtual machine compiled from the ArbOS module and the Geth (Ethereum client) core module. Since L2 is very different from L1, the original Geth core had to be lightly modified to work with ArbOS.

Therefore, the state transition on L2 is actually the joint work of ArbOS+Geth Core.

Arbitrum’s node client (sequencer, validator, full node, etc.) compiles the above-mentioned ArbOS+Geth Core processing program into native machine code that the node host can run directly (x86, ARM, PC, Mac, etc.).

If the compilation target is changed to Wasm, the result is the WAVM used by validators when generating fraud proofs; the contract that verifies the single-step proof also simulates the functions of the WAVM virtual machine.

So why compile to Wasm bytecode when generating fraud proofs? The main reason is that the contract verifying the single-step fraud proof must use an Ethereum smart contract to simulate a virtual machine that handles a given instruction set, and Wasm is easy to simulate in a contract.

However, WASM runs slightly slower than native machine code, so Arbitrum’s nodes and contracts use WAVM only when generating and verifying fraud proofs.

After the preceding rounds of interactive subdivision, the single-step proof finally proves a single instruction in the WAVM instruction set.

As can be seen in the code below, OneStepProofEntry first determines which category the operation code of the instruction to be proved belongs to, and then calls the corresponding prover such as Mem, Math, etc., to pass the single-step instruction into the prover contract.

The final result afterHash will be returned to the ChallengeManager. If the hash is inconsistent with the hash after the instruction operation recorded on the Rollup Block, the challenge is successful. If they are consistent, it means that there is no problem with the execution result of this command recorded on the Rollup Block, and the challenge failed.
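The dispatch-and-compare logic described here can be sketched with a toy stack machine. This is not the OneStepProofEntry contract code: the opcodes, the two prover categories, and the state hash are hypothetical stand-ins chosen to show the control flow only.

```python
import hashlib

def hash_state(stack):
    """Stand-in for the machine-state hash recorded during subdivision."""
    return hashlib.sha256(repr(tuple(stack)).encode()).hexdigest()

def prove_math(op, stack):
    """Toy 'Math' prover: binary arithmetic on the top two stack items."""
    b, a = stack.pop(), stack.pop()
    stack.append(a + b if op == "add" else a * b)

def prove_stack(op, stack):
    """Toy 'Stack' prover: duplicate or drop the top item."""
    if op == "dup":
        stack.append(stack[-1])
    else:  # "drop"
        stack.pop()

# Route each opcode to the prover for its category (hypothetical categories).
PROVERS = {"add": prove_math, "mul": prove_math, "dup": prove_stack, "drop": prove_stack}

def one_step_proof(before_stack, op, claimed_after_hash):
    """Re-execute one instruction and report whether the claimed result is wrong."""
    stack = list(before_stack)
    PROVERS[op](op, stack)
    after_hash = hash_state(stack)
    return after_hash != claimed_after_hash  # True means the challenge succeeds
```

For example, executing `add` on the stack `[2, 3, 4]` yields `[2, 7]`; a claimed result hash for any other stack makes `one_step_proof` return `True`, and the challenger wins.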

Disclaimer:

  1. This article is reprinted from [Geek Web3]. All copyrights belong to the original author [Luo Benben, former Arbitrum technical ambassador, geek web3 contributor]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.