
Message traps in the Arbitrum bridge


I wasn't too fond of writing this article. Not that I didn't have another choice. I could have let it go, moved on.

But that wouldn't have been fair to me. Nor to you.

So here I am, joined by Jaar in the background, and the millionth run of a proof of concept that just won't finish running.

Almost too reluctant to admit that a message bomb can burn all ETH in my Arbitrum relayer.

How did I get here?


Ethereum moves at an incredibly fast pace. The knowledge on how to build bridges has already faded to ancient wisdom. Hasn't it? It's unclear, somewhat hazy, passed down through oral tradition, and requires a great deal of faith.

Still, all L2s have found ways to set up communications between Ethereum and their domain.

A bridge is a two-way communication channel that allows you to send a message on Ethereum and receive it on L2, and vice versa. However, these paths are not the same; they have different mechanisms, participants, and security risks.

I find it fascinating to dig into the assumptions, optimizations and compromises each L2 project makes to get bridges right. But what is "right" anyway, right?

There's no official manual on how L1 <> L2 bridges must operate. Let alone how to implement them with secure, production-ready code. This is a craft we're only beginning to master.

It relies on practical intuition, sane software engineering practices, and experience. The first two, less common than you'd wish. The last, gained from rekt.news and Twitter threads.

Regardless, the question isn't whether bridges can be built - because they can. We're now trying to figure out what makes them secure.

Of all L2 bridges, let's discuss Arbitrum. I was already familiar with the bridge of Optimism - the competing optimistic rollup. So I was itching to unearth some hidden gems in Arbitrum's code.

Under the assumption that Ethereum is safe, and Arbitrum is less safe (beta, arbitrarily upgraded, less decentralized), it's critical that users can always exit Arbitrum and find refuge in Ethereum. The infrastructure that makes L2-to-L1 messages possible must be secure.

That's why I set out to explore L2-to-L1 message passing in Arbitrum. My goal: determine how safe it is - for all parties involved - to operate in the bridge. If it came to it, I was ready to privately disclose any relevant findings to the team.

Spoiler alert: it came to it. Though not in the way I had expected.

Let's not rush it. First I have to introduce a few ideas behind L2-to-L1 messages in Arbitrum.

The essence of L2-to-L1 messages

This should be quick.

Arbitrum's L2-to-L1 message passing is briefly explained in their docs. In essence, you have three stages:

  1. In L2, declare you want to execute stuff on L1.
  2. In real life, wait.
  3. In L1, someone executes the stuff you declared you wanted to execute.

Told you it was quick. The details, maybe not.

1. Declare you want to execute stuff on L1

The L2-to-L1 communication flow starts with a transaction on L2. In the transaction, the signer states that they want to execute a message on L1. Think of a message as a piece of calldata intended to be executed on an account in L1. Both calldata and the target could be whatever - the bridge is clever enough to handle arbitrary messages.

So how do you create this transaction?

By calling the sendTxToL1 function of the ArbSys precompile. This is a special contract in Arbitrum, stored at address 0x0000000000000000000000000000000000000064.

Its bytecode is:

cast code --rpc-url $ARBITRUM_RPC 0x0000000000000000000000000000000000000064
0xfe

Lol, "fe" means "faith" in spanish.

Ok sorry, the precompile. You cannot read its real code like that. It's like the precompiles in Ethereum for ecrecover and such. The executable code is kept in Arbitrum's nodes.

For ArbSys, that's in the ArbSys.go file (see the SendTxToL1 function). Not that it really matters though.

It's easier to treat the ArbSys precompile as a black box. And interact with it as if it were a regular contract. The interface is defined in the ArbSys.sol file. There you'll find the sendTxToL1 external function:

/**
 * @notice Send a transaction to L1
 * @dev it is not possible to execute on the L1 any L2-to-L1 transaction which contains data
 * to a contract address without any code (as enforced by the Bridge contract).
 * @param destination recipient address on L1
 * @param data (optional) calldata for L1 contract call
 * @return a unique identifier for this L2-to-L1 transaction.
 */
function sendTxToL1(address destination, bytes calldata data)
    external
    payable
    returns (uint256);

Upon execution, this function will emit the L2ToL1Tx event. It logs some data to ease the future validation and execution of the message on L1.

event L2ToL1Tx(
    address caller,
    address indexed destination,
    uint256 indexed hash,
    uint256 indexed position,
    uint256 arbBlockNum,
    uint256 ethBlockNum,
    uint256 timestamp,
    uint256 callvalue,
    bytes data
);
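
The same call works from a contract on L2, not just from an EOA. Here's a minimal sketch of that, assuming a hypothetical L2MessageSender contract; the precompile address and the sendTxToL1 interface are the ones shown above:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Minimal interface for the ArbSys precompile, matching the signature above
interface IArbSys {
    function sendTxToL1(address destination, bytes calldata data)
        external
        payable
        returns (uint256);
}

// Hypothetical L2 contract that forwards an arbitrary message to L1
contract L2MessageSender {
    IArbSys constant ARB_SYS =
        IArbSys(0x0000000000000000000000000000000000000064);

    // Returns the unique identifier assigned to the L2-to-L1 message
    function sendToL1(address target, bytes calldata data)
        external
        payable
        returns (uint256)
    {
        // msg.value is forwarded to the precompile and ends up as the
        // message's callvalue (see the L2ToL1Tx event above)
        return ARB_SYS.sendTxToL1{value: msg.value}(target, data);
    }
}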

Putting these pieces together, it's easy to build a script that submits L2-to-L1 messages in Arbitrum:

const { ethers } = require("hardhat");

async function sendMessageToL1() {
    const provider = ethers.getDefaultProvider(process.env.ARBITRUM_RPC);

    const arbsys = new ethers.Contract(
        "0x0000000000000000000000000000000000000064",
        ["function sendTxToL1(address target, bytes data) payable returns (uint256)"],
        new ethers.Wallet(process.env.PRIVATE_KEY, provider)
    );

    const target = ""; // address of an account in L1
    const data = []; // calldata to execute on target
    await arbsys.sendTxToL1(target, data);
}

sendMessageToL1().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});

So far so good. Set the target, set the calldata, and the L2-to-L1 message is ready to go.

2. Wait

Once submitted, you have to wait around 1 week before executing the message on L1. This is mostly due to Arbitrum's dispute window period.

Even after the dispute window, L2-to-L1 messages are not automatically executed on Ethereum. Somebody, like an incentivized relayer, has to grab the message and execute it by sending a transaction on L1.

Thus we reach the third and final stage. Beware, danger awaits.

3. Execute the stuff you declared you wanted to execute

The wait is over. The message is ready to be executed on L1.

Now a relayer must craft a transaction that wraps the message in a special package, including additional data that the L1 side of the bridge requires to receive, verify, and execute it.

Something like this:

Sketchy diagram that shows part of the L2-to-L1 message passing flow explained so far.

As we're about to see, the key steps that lead to execution of the message in L1 happen in two smart contracts: the Outbox and the Bridge.

Diagram highlighting the outbox and bridge contracts

The Outbox exposes the main entrypoint for the relayer to trigger message execution. Namely, the external executeTransaction function.

screenshot of code showing executeTransaction function

The first operation hashes all relevant data to build a unique identifier for the message.

Then, the internal recordOutputAsSpent function verifies that the message is legitimate, ensures that it hasn't already been executed, and marks it as spent. Even if the executeTransaction function is called by a rogue third-party relayer, it shouldn't be able to mess with any of the parameters. This is thanks to the verifications implemented in recordOutputAsSpent.

At last, there's a call to the internal executeTransactionImpl function. If you follow its logic, you'll eventually reach the internal executeBridgeCall function of the Outbox contract. This is the point where the actual message is passed to the Bridge contract.

screenshot of executeBridgeCall function

In turn, the executeCall function of the Bridge contract executes a low-level call to the target.

screenshot of executeCall function
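
To keep the whole chain in mind without jumping between screenshots, here's a heavily simplified sketch of it. This is not the production code: parameter lists are trimmed, names are approximate, the proof verification is reduced to a comment, and executeTransactionImpl / executeBridgeCall are collapsed into the caller.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

interface IBridgeSketch {
    function executeCall(address to, uint256 value, bytes calldata data)
        external
        returns (bool success, bytes memory returnData);
}

contract OutboxSketch {
    IBridgeSketch public bridge;
    mapping(bytes32 => bool) public spent; // identifiers of already-executed messages

    constructor(IBridgeSketch _bridge) {
        bridge = _bridge;
    }

    function executeTransaction(
        bytes32[] calldata proof,
        uint256 index,
        address l2Sender,
        address to,
        uint256 value,
        bytes calldata data // the real function also takes block numbers and a timestamp
    ) external {
        // 1. hash the message fields into a unique identifier
        bytes32 item = keccak256(abi.encodePacked(l2Sender, to, value, data));

        // 2. verify the message against a confirmed root, check it hasn't been
        //    executed before, and mark it as spent
        _recordOutputAsSpent(proof, index, item);

        // 3. hand the message over to the Bridge
        (bool success, bytes memory returnData) = bridge.executeCall(to, value, data);

        // 4. if the target's call failed, the whole relaying transaction reverts;
        //    the real code bubbles up the target's return data in the revert
        if (!success) revert(string(returnData));
    }

    function _recordOutputAsSpent(
        bytes32[] calldata, // Merkle proof (verification elided in this sketch)
        uint256,            // position of the message in the outbox tree
        bytes32 item
    ) internal {
        require(!spent[item], "already spent");
        spent[item] = true;
    }
}

contract BridgeSketch {
    // in the real contract, only an authorized Outbox may call this
    function executeCall(address to, uint256 value, bytes calldata data)
        external
        returns (bool success, bytes memory returnData)
    {
        // low-level call to the target: no explicit gas limit, and the
        // returned data is copied back into memory
        (success, returnData) = to.call{value: value}(data);
    }
}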

Summarizing these calls:

Sketchy diagram that shows L1 side of the bridge, summarizing the calls in the Outbox and the Bridge contract.

All code I've shown here is in production. You can see it for yourself on Ethereum mainnet. I encourage you to step through the execution of an L2-to-L1 message from Arbitrum using any transaction tracer. Read every line of code of these contracts. You can reproduce all my claims so far.

And arrive at the same conclusion I did. This code is not secure. Why?

Trouble comes in threes

I wasn't surprised to see an external call to the target in the Bridge contract. It had to be there. Still, something felt off. Until it clicked.

I realized that L2-to-L1 messages in Arbitrum have three peculiarities. Despite them being somewhat intertwined, let me try to disentangle them.

As I go, I'll compare them with Optimism's bridge. Because it behaves in exactly the opposite way.

1. The transaction's success depends on the success of the L2-to-L1 message

I want to be clear on something. A transaction that carries and executes a message is certainly not the same as the message itself. We saw this earlier. Executing the message is just one step of many in the relayer's transaction.

This separation, at least to me, is fundamental. Neither the message's behavior nor its success or failure should put at risk, let alone compromise, the job of whoever is relaying it.

This is not true in Arbitrum.

Look at the executeCall function of the Bridge contract. If the call to the target fails (for any reason), then the success flag is set to false, and the whole transaction reverts.

screenshot of code showing success flag

Optimism's bridge does just the opposite. Check out the relevant call in the OptimismPortal contract. Below you can see how the code doesn't act on the value of the success flag. It simply logs it.

screenshot of code showing success flag in Optimism
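
In rough Solidity terms, the two designs compare like this. This is a sketch of the pattern, not the exact code of either bridge:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Sketch of the two failure-handling styles for relayed messages
contract FailureHandlingSketch {
    event MessageRelayed(bytes32 indexed messageHash, bool success);

    // Arbitrum-style: the relaying transaction lives or dies with the message
    function relayAndRevertOnFailure(address target, bytes calldata data) external payable {
        (bool success, ) = target.call{value: msg.value}(data);
        require(success, "message failed, so the relayer's transaction fails too");
    }

    // Optimism-style: the outcome is only recorded; the relaying transaction
    // succeeds either way, and the message is considered finalized
    function relayAndLogOutcome(address target, bytes calldata data) external payable {
        (bool success, ) = target.call{value: msg.value}(data);
        emit MessageRelayed(keccak256(data), success);
    }
}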

Unaware of this behavior in Arbitrum, a relayer trying to do their job may attempt to execute a failed L2-to-L1 message again. And again. And again. It is up to the target to decide when the transaction relaying the message can be successfully executed.

This means L2-to-L1 messages in Arbitrum are retryable messages. Which is strange, since in Arbitrum only L1-to-L2 messages are documented as retryable. I guess L2-to-L1 messages should be called that as well.

I wouldn't say this behavior alone got me worried. Although it did smell bad. So I kept digging.

2. The message's execution is not explicitly bounded by gas

During submission on L2, the user originating the message never specifies the gas limit for L1 execution. Remember the signature for the sendTxToL1 function of ArbSys:

function sendTxToL1(address destination, bytes calldata data)
    external
    payable
    returns (uint256);

Thus, the execution of the message on L1 is not explicitly bounded by a fixed amount of gas. Look at the call from the Bridge to the target:

screenshot of code showing there is no gas limit

What does Optimism do? Just the opposite. Check out the relevant call in their OptimismPortal contract:

screenshot of code showing there is a gas limit in Optimism

Relayers are the most affected by this lack of explicit gas limits in Arbitrum's bridge.

Because targets can spend as much gas as they want. I know, you could quickly counter-argue this point, stating that relayers are in control of the transaction's gas limit. So it would be them, and not malicious targets, who dictate how much gas is to be spent.

My counter-counter-argument is that, in practice, generalized relayers may blindly trust the tooling to set whatever gas limit is necessary to execute the transaction (like, using eth_estimateGas). Especially if they're financially incentivized to successfully execute these transactions. Especially if there's no documentation warning them of risks.

I couldn't find open-source code of a production-ready relayer to fully back this claim. Yet, I can argue that if relayers are using Arbitrum's official SDK, by default they're not sensibly setting the gas limit.

In such a scenario, it would be the targets, not the relayers, in control of the transactions' gas limit.

Malicious targets could perform griefing attacks on relayers. By running gas-intensive operations on L2-to-L1 messages, they could drain the relayers' funds.

The sensible defense would be estimating gas prior to execution. I'd agree, only if experiences from MEV land didn't suggest otherwise. How sure are you that contracts cannot fingerprint their execution environment and alter their behavior on simulations? If they can, eth_estimateGas or eth_call may not be the most secure choice to simulate arbitrary message passing.

The safety mechanism must be placed in the bridge itself. A fixed gasLimit when calling the target from the bridge is a more effective countermeasure. Just like Optimism does. A relayer could read the message's parameters, and then build reliable, more predictable estimates on the gas costs of any L2-to-L1 message.
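
In Solidity, the difference is a single gas override on the inner call. A sketch of the idea, assuming the gasLimit comes from the message's parameters as it does in Optimism:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Sketch: bounding the gas forwarded to the target of a relayed message
contract GasBoundedCallSketch {
    // Unbounded: the target decides how much of the relayer's gas it burns
    function callUnbounded(address target, bytes calldata data) external payable returns (bool success) {
        (success, ) = target.call{value: msg.value}(data);
    }

    // Bounded: at most gasLimit gas is forwarded to the target, so relayers
    // can price the message ahead of time from its parameters alone
    function callBounded(address target, uint256 gasLimit, bytes calldata data) external payable returns (bool success) {
        (success, ) = target.call{gas: gasLimit, value: msg.value}(data);
    }
}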

Including the gasLimit covers a significant part of the attack surface. It mitigates any type of griefing attack that attempts to consume too much gas in the context of the target. Like a lengthy loop and other mischievous tricks. However, while necessary, a fixed gas limit on the inner call is not sufficient.

There's one more trick up the attacker's sleeve. One that can render the gasLimit fix pointless.

3. The bridge handles returned data

Arbitrum's bridge copies the returned data upon executing the target's code. This data is passed back to the Outbox contract. Depending on the message's success, the data is either logged, or included in an error message.

screenshot of code showing where the Arbitrum bridge copies returned data

The target-controlled data is copied to EVM memory. This happens in the context of the bridge, not the target. By controlling the data's size, a target can still control how much gas is consumed. Even after its code finished running. In spite of whatever fixed gas limit was set when called.
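
To put numbers on it: expanding EVM memory to w 32-byte words costs 3*w + w^2/512 gas. Copying a return payload of about 1 MB (32,768 words) into fresh memory therefore costs roughly 98,304 + 2,097,152 ≈ 2.2 million gas, paid in the bridge's frame, on top of whatever the target itself consumed.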

Optimism's bridge knows about this. Hence, it does exactly the opposite. Where Arbitrum copies all returned data, Optimism copies none.

See below how Optimism's bridge uses the call function of the SafeCall library, which executes a low-level call without copying returned data to memory.

screenshot of code showing how the Optimism bridge copies the data

This is the code of SafeCall::call:

screenshot of code showing the safecall library
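
In essence, the pattern shown above is a raw call whose return-data destination has zero size, so nothing is ever copied back into the caller's memory. A minimal sketch of the idea, not Optimism's exact code:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Sketch: a low-level call that never copies return data into the caller's memory
library NoCopyCall {
    function call(
        address target,
        uint256 gasLimit,
        uint256 value,
        bytes memory data
    ) internal returns (bool success) {
        assembly {
            success := call(
                gasLimit,        // gas forwarded to the target
                target,          // call target
                value,           // ETH sent along
                add(data, 0x20), // input offset: skip the length word of `data`
                mload(data),     // input size: length of `data`
                0,               // output offset: unused
                0                // output size: zero, so no return data is copied
            )
        }
    }
}

If the caller ever needs the returned data, it can still read it explicitly with returndatasize() and returndatacopy(), capped to a size the caller chooses.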

The comment in Optimism's code is crystal-clear. By not copying return data, they protect against "return bombs". What's this attack?

The ExcessivelySafeCall repository explains it best:

When bytes are copied from returndata to memory, the memory expansion cost is paid. This means that when using a standard solidity call, the callee can "returnbomb" the caller, imposing an arbitrary gas cost.

There's no defense against return bombs in Arbitrum's bridge. An attacker could trick relayers into paying absurdly high gas costs for a message.

A target that drops dynamic return bombs on callers can look like this:

pragma solidity ^0.8.0;

// Malicious L1 target that return-bombs whoever calls it
contract L1Target {
    address private immutable root = msg.sender;
    uint256 private size;

    constructor(uint256 _size) {
        size = _size;
    }

    // The deployer can tune the bomb's size at any time, even right before
    // a relayer executes the L2-to-L1 message
    function setSize(uint256 _size) external {
        require(msg.sender == root);
        size = _size;
    }

    // Return `size` bytes of zeroed memory: whoever copies this return data
    // pays the memory expansion cost
    fallback() external {
        assembly { return(0, sload(size.slot)) }
    }
}

Given the lack of documentation, awareness, and unsafe defaults in the tooling, it wouldn't be surprising to find relayers facing this threat. Here's the PoC.

Strength before weakness

For what doth a bridge serve, but to grant passage for messages at will? A message is but a summons to a target mastered by the user, with data handed by the user as well. The heinous to.call(data) doth become inescapable. --- Shakespeare.

Is a bridge fated to fall to all assaults stemming from an external call? Shall all hope be forever lost?

After thousands of pages of thinking so, Kaladin would say NO. We can always choose strength before weakness.

We can mitigate these threats, placing every imaginable defense mechanism to reduce the attack surface and the potential impact of exploits.

There are so many sides to bridge security. Ensuring relayers can reliably operate in a safe environment is one of them. At least on this point, Arbitrum's bridge could provide stronger safety guarantees.

Targets of L2-to-L1 messages can attempt to impose arbitrary execution costs on relayers. It is up to relayers to protect against this undocumented threat. The lack of fixed and explicit gas limits for messages may not allow them to securely estimate transaction costs. And even if such a safety measure were in place, it wouldn't be enough. Due to return bombs.

With a return bomb, a malicious target can bypass the gas limit on the internal call. They can control gas consumption in the context of the bridge itself. Targets can change their behavior any time, and even attempt to front-run relayer transactions to craft the attack right before execution of the L2-to-L1 message.

On top of this, the success of the relayers' transactions depends on the success of the messages. Malicious targets can make transactions relaying L2-to-L1 messages revert due to out-of-gas exceptions. Relayers would still pay the cost of these failed transactions. Unsophisticated relayers may even attempt to relay failed messages multiple times, incurring even higher gas costs.

These are the specific measures I shared with Arbitrum to strengthen the bridge: bound the inner call to the target with an explicit gas limit, avoid copying the target's return data, decouple the success of the relaying transaction from the success of the message, and document these risks so relayers can protect themselves.

I shared this with Arbitrum about a month ago via their Immunefi bounty program. I titled the report "Unbounded gas consumption with unhandled failure in bridge call may allow griefing attack and prevent L2-to-L1 message passing". First emphasizing the return bombs, then expanding into the whole scope of possible attack vectors.

Their team was quick to reply to my claims. After a brief back and forth, the report was disregarded. All I have described is "intended behavior". Which I guess is... fine?

Anyway, at least now we're all aware of the intended message traps in the Arbitrum bridge.