Arbitrum’s Component Structure Interpreted by the Former Arbitrum Technical Ambassador (Part 1)

Advanced | Jan 17, 2024
This article attempts to fill the gap in the field by providing an understanding of the operational mechanism of Arbitrum, focusing on the technical interpretation of Arbitrum One.

Professional explanations of Layer 2 protocols, especially Arbitrum and OP Rollup, are scarce in the Chinese-speaking community. This article aims to fill that gap by explaining how Arbitrum operates. Because Arbitrum itself is complex, the full text still exceeds 10,000 words even after being simplified as much as possible, so it is divided into two parts. We recommend saving and sharing it as reference material.

Overview of Rollup Sequencer

The principle of Rollup scaling can be summarized in two points:

Cost Optimization: Move most computation and storage to Layer 2, which runs off-chain relative to L1. L2 mostly runs on a single server, referred to as the sequencer or operator, and forms a separate chain.

To an observer, the sequencer looks almost like a centralized server: it gives up decentralization, one corner of the blockchain trilemma (the “impossible trinity”), in exchange for higher TPS and lower cost. Users can send their transactions to L2 instead of Ethereum and pay far less than they would on Ethereum.

(Source: BNB Chain)

Security Assurance: Transaction content and the post-transaction state on L2 are synchronized to Ethereum L1, where contracts verify the validity of each state transition. Ethereum also retains the history of L2, so even if the sequencer crashes permanently, others can reconstruct the entire L2 state from the records on Ethereum.

Fundamentally, the security of Rollup relies on Ethereum. If the sequencer doesn’t know the private key of an account, it cannot initiate transactions on behalf of that account or manipulate the account’s asset balance (even if attempted, it would be quickly detected).

Although the sequencer is a centralized hub, in mature Rollup solutions it can only misbehave in “soft” ways, such as censoring transactions or deliberately going offline. An ideal Rollup has measures to restrain such behavior, for example anti-censorship mechanisms like forced withdrawals or sequencing proofs.

(The Loopring protocol sets a forced withdrawal function in the contract source code on L1 for users to call)

The state verification mechanisms to prevent malicious actions by Rollup sequencers are divided into two categories: Fraud Proof and Validity Proof. Rollup solutions using Fraud Proof are referred to as OP Rollup (Optimistic Rollup, OPR), while those using Validity Proof are often called ZK Rollup (Zero-knowledge Proof Rollup, ZKR) rather than Validity Rollup due to historical baggage.

Arbitrum One is a typical OPR. Its contracts deployed on L1 do not actively validate the submitted data; they optimistically assume it is correct. If the submitted data contains an error, L2 validator nodes actively initiate a challenge.

Therefore, OPR implies a trust assumption: at any given time, there is at least one honest L2 validator node. On the other hand, ZKR contracts actively and inexpensively validate the data submitted by the sequencer through cryptographic calculations.

(Optimistic Rollup operation method)

(ZK Rollup operation method)

This article gives an in-depth introduction to Arbitrum One, the leading Optimistic Rollup project, covering every part of the system. After reading it carefully, you will have a profound understanding of Arbitrum and Optimistic Rollup (OPR).

Core Components and Workflow of Arbitrum

Core Contracts:

The most crucial contracts in Arbitrum include SequencerInbox, DelayedInbox, L1 Gateways, L2 Gateways, Outbox, RollupCore, Bridge, etc. These will be detailed in the following sections.

Sequencer:

The sequencer receives user transactions, sorts them, calculates the transaction results, and quickly (usually <1s) returns receipts to users. Users often see their transactions on L2 within a few seconds, providing an experience similar to Web2 platforms.

At the same time, the sequencer immediately broadcasts each newly generated L2 block off-chain, that is, outside Ethereum. Any Layer 2 node can receive these L2 blocks asynchronously. At this point, however, the blocks have no finality and can still be rolled back by the sequencer.

Every few minutes, the sequencer compresses the ordered L2 transaction data, aggregates it into batches, and submits them to the SequencerInbox contract on Layer 1 to ensure data availability and keep the Rollup protocol running. Generally, L2 data submitted to Layer 1 cannot be rolled back and can be considered final.

From the above we can summarize: Layer 2 has its own node network, but the nodes are few and generally run no consensus protocol of the kind public chains use, so on its own the network provides very poor security. It must rely on Ethereum to guarantee the reliability of data publication and the validity of state transitions.

Arbitrum Rollup Protocol:

This is a set of contracts that defines the structure of the Rollup chain’s block (the RBlock), how the chain is extended, how RBlocks are published, and how the challenge process runs. Note that the Rollup chain mentioned here is not the Layer 2 ledger as commonly understood, but an abstract “chain-like data structure” that Arbitrum One sets up independently in order to implement the fraud proof mechanism.

One RBlock can contain the results of multiple L2 blocks, and its data differs considerably from that of an L2 block. The RBlock data itself is stored in the RollupCore contract. If an RBlock is faulty, a Validator will challenge its submitter.

Validator:

Arbitrum’s validator nodes are a special subset of Layer 2 full nodes, and admission is currently restricted by a whitelist.


A Validator creates a new RBlock (Rollup block, also called an assertion) based on the transaction batches that the sequencer submits to the SequencerInbox contract. It also monitors the state of the current Rollup chain and challenges incorrect data submitted on-chain.

An active Validator must stake assets on Ethereum in advance, which is why it is sometimes called a Staker. Layer 2 nodes that do not stake can still monitor the Rollup and raise alarms for users, but they cannot directly intervene on Ethereum against incorrect data.

Challenge:

The basic steps can be summarized as multiple rounds of interactive segmentation followed by a one-step proof. During segmentation, the two parties to the challenge repeatedly subdivide the disputed transaction data until they isolate a single disputed opcode instruction, which is then verified. Arbitrum’s developers consider this “multi-round segmentation, one-step proof” paradigm the most gas-efficient implementation of fraud proofs. Every stage is controlled by contracts, so neither party can cheat.

Challenge period:

Because OP Rollup is optimistic, the contract does not actively check an RBlock after it is submitted on-chain; instead it leaves a window during which validators can falsify it. This window is the challenge period, which lasts one week on Arbitrum One mainnet. After the challenge period ends, the RBlock is finally confirmed, and only then can the L2-to-L1 messages inside it (such as withdrawals made through the official bridge) be released.
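The confirmation rule described above can be sketched in a few lines. This is only an illustration: the class and function names are hypothetical, and wall-clock seconds are used here as a simplifying assumption for how the one-week window is measured.

```python
from dataclasses import dataclass

# One week, as stated in the text for Arbitrum One mainnet.
CHALLENGE_PERIOD_SECONDS = 7 * 24 * 60 * 60


@dataclass
class Assertion:
    """Hypothetical stand-in for an RBlock posted to L1."""
    posted_at: int            # timestamp at which the RBlock was posted
    falsified: bool = False   # set if a challenge against it succeeded


def can_confirm(assertion: Assertion, now: int) -> bool:
    """An RBlock is confirmed only after the window passes unfalsified."""
    window_over = now - assertion.posted_at >= CHALLENGE_PERIOD_SECONDS
    return window_over and not assertion.falsified


def can_release_withdrawal(assertion: Assertion, now: int) -> bool:
    """L2-to-L1 messages are released only once their RBlock is confirmed."""
    return can_confirm(assertion, now)
```

This is why a withdrawal through the official bridge takes about a week: the message cannot be executed on L1 before the RBlock containing it is confirmed.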

ArbOS, Geth, WAVM:

The virtual machine used by Arbitrum is called AVM, which includes Geth and ArbOS. Geth is the most commonly used client software in Ethereum, and Arbitrum has made lightweight modifications to it. ArbOS is responsible for all L2-related special functions, such as network resource management, generating L2 blocks, working with EVM, etc. We regard the combination of the two as a Native AVM, which is the virtual machine used by Arbitrum. WAVM is the result of compiling AVM code into Wasm. In the Arbitrum challenge process, the last “one-step proof” verifies the WAVM instruction.

Here, we can use the following figure to represent the relationship and workflow between the above components:

Transaction life cycle on L2

The processing flow of a transaction on L2 is as follows:

  1. Users send transactions to the sequencer.

  2. The sequencer first checks the digital signatures and other data of the pending transactions, discards invalid ones, then orders and executes the rest.

  3. The sequencer sends the transaction receipt to the user (usually very quickly). This is only “preprocessing” performed by the sequencer off-chain; the transaction has Soft Finality and is not yet reliable. Users who trust the sequencer (most users) can nevertheless optimistically assume that the transaction is complete and will not be rolled back.

  4. The sequencer highly compresses the preprocessed original transaction data and encapsulates it into a Batch.

  5. Every once in a while (affected by factors such as data volume and ETH congestion), the sequencer will publish transaction batches to the Sequencer Inbox contract on L1. At this point, it can be considered that the transaction has Hard Finality.
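The five steps above can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the `Tx` fields, the boolean that stands in for a real signature check, the JSON encoding, and the use of Python’s standard `zlib` in place of the compression scheme Arbitrum actually uses.

```python
import json
import zlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Tx:
    sender: str
    nonce: int
    payload: str
    signature_ok: bool  # stand-in for a real signature verification


class ToySequencer:
    def __init__(self):
        self.pending = []   # ordered, pre-processed transactions (soft finality)
        self.l1_inbox = []  # stand-in for batches posted to SequencerInbox

    def receive(self, tx):
        """Steps 1-3: validate, order, and return a receipt immediately."""
        if not tx.signature_ok:
            return None  # invalid transactions are dropped
        self.pending.append(tx)
        return {"position": len(self.pending) - 1, "finality": "soft"}

    def post_batch(self):
        """Steps 4-5: compress the ordered transactions and publish them."""
        raw = json.dumps([asdict(t) for t in self.pending]).encode()
        batch = zlib.compress(raw)
        self.l1_inbox.append(batch)  # from here on: hard finality
        self.pending = []
        return batch


def decode_batch(batch):
    """Anyone can recover the ordered inputs from a published batch."""
    return [Tx(**fields) for fields in json.loads(zlib.decompress(batch))]
```

The point of the sketch is the split between the fast receipt (soft finality, trust in the sequencer) and the later batch on L1 (hard finality, trust in Ethereum).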

Sequencer Inbox Contract

The contract receives the transaction batches submitted by the sequencer to ensure data availability. Looking deeper, the batch data in the Sequencer Inbox is a complete record of Layer 2’s transaction inputs. Even if the sequencer goes down permanently, anyone can restore the current state of Layer 2 from the batch records and take over from the failed or absconded sequencer.

To use a physical analogy, the L2 we see is just the projection (the shadow) of the batches in SequencerInbox, and the light source is the STF (state transition function). Because the STF does not change easily, the shape of the shadow is determined only by the batches, which act as the object.

The Sequencer Inbox contract is also called the fast box. The sequencer submits preprocessed transactions specifically to it, and only the sequencer can submit data to it. Its counterpart is the slow box, the Delayed Inbox, whose function is described later.

Validators continuously monitor the Sequencer Inbox contract. Whenever the sequencer publishes a batch to the contract, an on-chain event is emitted. When a Validator detects this event, it downloads the batch data, executes it locally, and then publishes an RBlock to the Rollup protocol contract on Ethereum.

Arbitrum’s bridge contract has a parameter called the accumulator, which records the newly submitted L2 batch, as well as the number and information of newly received transactions on the slow Inbox.


(The sequencer continuously submits batches to Sequencer Inbox)

(The specific information of the Batch; the data field corresponds to the Batch data. The size of this part of the data is very large, and the screenshot is not fully displayed.)

The SequencerInbox contract has two main functions:

addSequencerL2BatchFromOrigin()

The sequencer calls this function each time it submits batch data to the Sequencer Inbox contract.

forceInclusion()

This function can be called by anyone and is used to implement censorship-resistant transactions. The way this function takes effect will be explained in detail later when we talk about the Delayed Inbox contract.

The above two functions will call ‘bridge.enqueueSequencerMessage()’ to update the accumulator parameter in the bridge contract.
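One common way to build such an accumulator is a hash chain, where each new value commits to the previous value plus the new message. The sketch below illustrates only that idea. The exact fields, their encoding, and the use of SHA-256 are assumptions made so the example runs on the Python standard library; they do not reproduce the contract’s actual layout.

```python
import hashlib

ZERO_ACC = bytes(32)  # assumed starting value before any batch is recorded


def next_accumulator(prev_acc, batch, delayed_read):
    """Fold a new batch and the delayed-inbox count into the running hash."""
    batch_hash = hashlib.sha256(batch).digest()
    count = delayed_read.to_bytes(32, "big")
    return hashlib.sha256(prev_acc + batch_hash + count).digest()


class ToyBridge:
    """Hypothetical stand-in for the bridge contract's accumulator state."""

    def __init__(self):
        self.sequencer_accs = []

    def enqueue_sequencer_message(self, batch, delayed_read):
        prev = self.sequencer_accs[-1] if self.sequencer_accs else ZERO_ACC
        acc = next_accumulator(prev, batch, delayed_read)
        self.sequencer_accs.append(acc)
        return acc
```

Because every value depends on all earlier ones, a single 32-byte accumulator pins down the entire ordered history of batches: changing, reordering, or dropping any batch changes the final value.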

Gas pricing

Obviously, L2 transactions cannot be free, since that would invite DoS attacks. In addition, the L2 sequencer has its own operating costs, and submitting data to L1 incurs overhead. When a user initiates a transaction on the Layer 2 network, the gas fee is made up of the following:

Data publishing costs incurred by occupying Layer1 resources

This cost comes mainly from the batches submitted by the sequencer (each batch contains many user transactions) and is ultimately shared among the transaction initiators. The pricing algorithm for data publication fees is dynamic: the sequencer sets the price based on its recent profit and loss, the batch size, and the current Ethereum gas price.

The cost incurred by users occupying Layer 2 resources

A per-second gas limit is set to keep the system running stably (currently 7 million gas per second on Arbitrum One). Base gas prices for both L1 and L2 are tracked and adjusted by ArbOS; the formulas are not detailed here.

Although the specific gas price calculation process is relatively complicated, users do not need to be aware of these details and can clearly feel that Rollup transaction fees are much cheaper than the ETH mainnet.
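As a rough illustration of the two components described above, the total fee can be thought of as an L2 execution part plus an L1 data part. The function and all numbers below are hypothetical; the real prices are adjusted dynamically by ArbOS and are not reproduced here.

```python
def l2_tx_fee(l2_gas_used, l2_gas_price, data_bytes, l1_price_per_byte):
    """Toy fee model in wei: L2 execution cost plus a share of L1 data cost."""
    execution_fee = l2_gas_used * l2_gas_price  # occupying Layer 2 resources
    data_fee = data_bytes * l1_price_per_byte   # occupying Layer 1 resources
    return execution_fee + data_fee


# Hypothetical inputs: 21,000 gas at 0.1 gwei on L2, and 120 bytes of
# compressed data priced at 1 gwei per byte for publication on L1.
example_fee = l2_tx_fee(21_000, 100_000_000, 120, 1_000_000_000)
```

With these made-up inputs the execution part is 2,100,000,000,000 wei and the data part is 120,000,000,000 wei, which shows how either component can dominate depending on L1 gas prices and how well the batch compresses.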

Optimistic fraud proof

Recalling the above, L2 is actually just the projection of the transaction input batch submitted by the sequencer in the fast box, that is:

Transaction Inputs -> STF -> State Outputs. The input has been determined and the STF is unchanged, so the output is also determined. The fraud proof system and the Arbitrum Rollup protocol together publish the output state root to L1 in the form of an RBlock (aka assertion) and prove it optimistically.
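The determinism claim can be made concrete with a toy state transition function. The balance-transfer rules and the hashing of the state into a “root” are assumptions chosen for illustration, not Arbitrum’s actual STF.

```python
import hashlib
from functools import reduce


def stf(state, tx):
    """Toy STF: apply one (sender, receiver, amount) transfer to the state."""
    sender, receiver, amount = tx
    if state.get(sender, 0) < amount:
        return state  # an invalid transfer leaves the state unchanged
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def replay(genesis, batches):
    """Rebuild the L2 state purely from the ordered inputs recorded on L1."""
    state = genesis
    for batch in batches:
        state = reduce(stf, batch, state)
    return state


def state_root(state):
    """Stand-in for a state root: a hash over the sorted account balances."""
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()
```

Any two nodes that replay the same batches from the same genesis state arrive at the same root, which is what lets anyone check an RBlock or rebuild L2 if the sequencer disappears.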

On L1, there is input data published by the sequencer and output state published by validators. Let’s consider this carefully: is it necessary to publish the Layer 2 state on-chain?

Because the input completely determines the output, and the input data is publicly visible, submitting the output state may seem redundant. But this view ignores the practical need for state settlement between the two systems, L1 and L2: a withdrawal from L2 to L1 requires a proof of state.

One of the core ideas of a Rollup is to put most computation and storage on L2 to avoid the high cost of L1. This means that L1 does not know the state of L2. It only helps the L2 sequencer publish the input data of all transactions and is not responsible for computing the L2 state.

A withdrawal essentially unlocks the corresponding funds from an L1 contract and transfers them to the user’s L1 account, or performs some other action, according to the cross-chain message sent from L2.

At this point the Layer 1 contract will ask: what is your state on Layer 2, and how do you prove that you really own the assets you claim to be transferring? The user must then provide the corresponding Merkle proof and related data.
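A Merkle proof lets the L1 contract check one entry against a published root without seeing the rest of the data. The sketch below is a generic binary Merkle tree using SHA-256; the leaf encoding and the duplicate-last-node padding rule are assumptions, not the format Arbitrum’s contracts use.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by duplicating the last node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append(level[index ^ 1])  # the sibling differs in the lowest bit
        index //= 2
    return proof


def verify(leaf, index, proof, root):
    """What the verifier does: recompute the root from the leaf and siblings."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The verifier only needs the root, which is why the state root has to be published on L1 for withdrawals to be provable at all.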

Therefore, if we build a Rollup without a withdrawal function, it is theoretically possible not to synchronize the state to L1, and there is no need for a state proof system such as fraud proof (although it may cause other problems). But in real applications, this is obviously not feasible.

In the so-called optimistic proof, the contract does not check whether the output state submitted to L1 is correct, but optimistically assumes that everything is accurate. The optimistic proof system assumes that there is at least one honest Validator at any time. If an incorrect state is published, it will be challenged through a fraud proof.

The advantage of this design is that there is no need to actively verify every RBlock published to L1, which avoids wasting gas. In fact, for OPR it is unrealistic to verify every assertion: each RBlock contains one or more L2 blocks, so every transaction would have to be re-executed on L1. That is no different from executing L2 transactions directly on L1, which would make Layer 2 scaling meaningless.

ZKR does not have this problem, because a ZK proof is succinct. The contract only needs to verify a very small proof and does not need to execute the many transactions the proof covers. Therefore, ZKR does not operate optimistically: every time a state is published, a Verifier contract checks it mathematically.

Although a fraud proof cannot be as succinct as a zero-knowledge proof, Arbitrum uses a turn-based interactive process of “multi-round segmentation, one-step proof”. In the end, only a single virtual machine opcode needs to be proved, so the cost is relatively small.

Rollup protocol

Let’s first take a look at the entrance to initiate challenges and start proofs, that is, how the Rollup protocol works.

The core contract of the Rollup protocol is RollupProxy.sol, which uses an unusual dual-proxy structure while keeping the data structure consistent: one proxy is backed by two implementations, RollupUserLogic.sol and RollupAdminLogic.sol. Block explorers such as Etherscan cannot parse this structure well.

Moreover, the ChallengeManager.sol contract is responsible for managing challenges, and the OneStepProver series contracts are used to determine fraud proofs.

(Source: L2BEAT official website)

RollupProxy records a series of RBlocks (aka assertions) submitted by different Validators. These are the boxes in the figure below: green is confirmed, blue is unconfirmed, and yellow is falsified.

RBlock contains the final state after the execution of one or more L2 blocks since the last RBlock. These RBlocks form a formal Rollup Chain (note that the L2 ledger itself is different). Under optimistic circumstances, this Rollup Chain should have no forks, because a fork means that a Validator has submitted conflicting Rollup Blocks.

To propose or agree with an assertion, a validator must first stake a certain amount of ETH on that assertion and become a Staker. That way, when a challenge (fraud proof) occurs, the loser’s collateral is slashed. This is the economic basis that keeps validators honest.

The blue block No. 111 in the lower right corner of the picture will eventually be slashed because its parent block No.104 (yellow) is wrong.

In addition, Validator A proposed Rollup Block No. 106, but Validator B disagreed and challenged it.
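The relationships described above (forks, and descendants of a falsified block being doomed) can be sketched with a small parent-pointer structure. Block numbers 104, 106 and 111 come from the text; block 103, the parent assigned to 106, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RBlock:
    number: int
    parent: Optional[int]
    status: str = "unconfirmed"  # "confirmed", "unconfirmed" or "falsified"


def is_doomed(blocks, number):
    """True if the block itself or any of its ancestors has been falsified."""
    current = number
    while current is not None:
        if blocks[current].status == "falsified":
            return True
        current = blocks[current].parent
    return False


def has_fork(blocks):
    """A fork exists when two RBlocks claim the same parent."""
    parents = [b.parent for b in blocks.values() if b.parent is not None]
    return len(parents) != len(set(parents))


chain = {
    103: RBlock(103, None, "confirmed"),  # hypothetical common ancestor
    104: RBlock(104, 103, "falsified"),
    106: RBlock(106, 103),                # parent is assumed for this example
    111: RBlock(111, 104),
}
```

In this layout block 111 is doomed even though nobody challenged it directly, because it builds on the falsified block 104.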

After B initiates the challenge, the ChallengeManager contract will be responsible for verifying the segmentation of the challenge steps:

  1. Segmentation is a process in which the two parties take turns. One party divides the historical data contained in a Rollup Block into segments, and the other party points out which segment is problematic. The process resembles binary search (in practice the range is split N/K ways rather than in two) and gradually narrows the scope.

  2. The parties then continue to locate the problematic transaction and result, and subdivide further until they reach a single disputed machine instruction within that transaction.

  3. The ChallengeManager contract only checks whether the “data fragments” generated after segmenting the original data are valid.

  4. When the challenger and the challenged party locate the machine instruction that will be challenged, the challenger calls ‘oneStepProveExecution()’ function and sends a one-step fraud proof to prove that there is a problem with the execution result of this machine instruction.
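The narrowing process in the steps above can be sketched as a search over two execution traces. This sketch splits the range in two each round, which is a simplification of the N/K split mentioned in step 1, and it runs both parties’ moves inside one function instead of alternating on-chain transactions. The trace format and the way a fault is simulated are assumptions.

```python
import hashlib


def trace(instructions, faulty_step=None):
    """Return state hashes h[0..n]; optionally corrupt the result of one step."""
    hashes = [hashlib.sha256(b"genesis").digest()]
    for i, instr in enumerate(instructions):
        data = hashes[-1] + instr
        if i == faulty_step:
            data += b"!"  # simulate a wrong execution result at this step
        hashes.append(hashlib.sha256(data).digest())
    return hashes


def bisect_dispute(asserter, challenger):
    """Find step i where both agree on hash i but disagree on hash i + 1."""
    lo, hi = 0, len(asserter) - 1
    # Precondition: same starting state, different claimed end state.
    assert asserter[lo] == challenger[lo] and asserter[hi] != challenger[hi]
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserter[mid] == challenger[mid]:
            lo = mid  # still in agreement: the fault lies in the later half
        else:
            hi = mid  # already diverged: the fault lies in the earlier half
        rounds += 1
    return lo, rounds
```

For a trace of 1,000 steps, at most 10 rounds are needed to isolate the single disputed instruction, and only that one instruction has to be executed on L1.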

One-step proof

One-step proof is the core of the entire Arbitrum fraud proof. Let’s take a look at what the one-step proof specifically proves.

This requires understanding WAVM first. WAVM (Wasm Arbitrum Virtual Machine) is the virtual machine produced by compiling the ArbOS module together with the Geth (Ethereum client) core module. Since L2 is very different from L1, the original Geth core had to be lightly modified to work with ArbOS.

Therefore, the state transition on L2 is actually the joint work of ArbOS+Geth Core.

Arbitrum’s node client (sequencer, validator, full node, etc.) compiles the ArbOS+Geth Core program described above into native machine code that the node’s host can run directly (for x86, ARM, PC, Mac, and so on).

If you change the compilation target to Wasm, you get the WAVM that validators use when generating fraud proofs. The contract that verifies the one-step proof likewise emulates the WAVM virtual machine.

So why compile to Wasm bytecode when generating fraud proofs? The main reason is that the contract verifying a one-step fraud proof must use an Ethereum smart contract to emulate a virtual machine that handles a given instruction set, and Wasm is easy to emulate in a contract.

However, WASM runs slightly slower than Native machine code, so Arbitrum’s nodes/contracts will use WAVM only when generating and verifying fraud proofs.

After the preceding rounds of interactive segmentation, the one-step proof finally proves a single instruction in the WAVM instruction set.

In the contract code, OneStepProofEntry first determines which category the opcode of the instruction to be proved belongs to, then calls the corresponding prover (such as Mem or Math) and passes the single instruction into that prover contract.

The final result, afterHash, is returned to the ChallengeManager. If it does not match the post-instruction hash recorded in the Rollup Block, the challenge succeeds. If the two match, the execution result of this instruction recorded in the Rollup Block is correct, and the challenge fails.
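The dispatch-and-compare logic described above can be sketched with a toy stack machine. The three opcodes, the two prover functions, and the way the machine state is hashed are all hypothetical; they illustrate the pattern only and are not the WAVM instruction set or the contract’s hashing scheme.

```python
import hashlib


def hash_state(stack, pc):
    """Toy commitment to the machine state (stack contents, program counter)."""
    return hashlib.sha256(repr((tuple(stack), pc)).encode()).digest()


def math_prover(stack, opcode, arg):
    """Executes arithmetic opcodes on the top two stack items."""
    b, a = stack[-1], stack[-2]
    result = a + b if opcode == "add" else a * b
    return stack[:-2] + [result]


def stack_prover(stack, opcode, arg):
    """Executes stack opcodes; only 'push' exists in this toy machine."""
    return stack + [arg]


# The entry point first classifies the opcode, then calls the matching prover.
PROVER_FOR_OPCODE = {"add": math_prover, "mul": math_prover, "push": stack_prover}


def one_step(stack, pc, opcode, arg=None):
    """Execute exactly one instruction and return the resulting afterHash."""
    prover = PROVER_FOR_OPCODE[opcode]
    return hash_state(prover(list(stack), opcode, arg), pc + 1)


def challenge_succeeds(claimed_after_hash, stack, pc, opcode, arg=None):
    """The challenge succeeds if the recorded hash differs from the recomputed one."""
    return one_step(stack, pc, opcode, arg) != claimed_after_hash
```

For example, if the Rollup Block claims that executing `add` on the stack `[2, 3]` leaves `[6]`, the recomputed afterHash for `[5]` will not match and the challenge succeeds.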

In Part 2, we will analyze the contract modules in Arbitrum that handle cross-chain messaging and bridging between Layer 2 and Layer 1, and further clarify how a true Layer 2 should achieve censorship resistance.

Disclaimer:

  1. This article is reprinted from [WeChat]. All copyrights belong to the original author [Luo Benben]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.