BTC Relay

BTC Relay is a very interesting Ethereum project. It is an Ethereum contract that stores Bitcoin block headers. BTC Relay uses these headers to build a mini-version of the Bitcoin blockchain, which Ethereum contracts can then use to verify Bitcoin transactions.

Recently, it was announced that BTC Relay was included in the Ethereum Bug Bounty program. I’ve been interested in looking under the hood of BTC Relay since Joseph Chow’s presentation at Devcon 1.

Previously, I have studied the various VM implementations, and this was my first real in-depth study of an Ethereum service on the application level, as opposed to infrastructure. So naturally, I gave it a go.

BTCRelay internals

There are a few different steps involved in BTC relay.

A relayer submits a Bitcoin block header to the service (see btcrelay.se for one method to submit headers - I’ll mention other ways later). The bitcoin-style header hash is calculated (m_dblShaFlip), the presence of a parent block is verified, and the difficulty is checked.

No validation is made of the actual transactions! This may seem weird at first glance, but that is how SPV clients work. The block is added to the list of blocks, with references to its ancestors.

Internally, blocks are stored in a dictionary-like construct, where they are indexed by their hash. However, for the typical use case, we want to look up a block at a certain block number on the main chain, and verify the presence of a particular transaction there.

To enable that, each block also contains references to ancestors, more or less like a linked list - but with a twist. Each stored block has 8 references: just like a linked list, it has a reference to the immediate parent. But it also has references to the 5th parent, the 25th parent, and so on, up to the 5^7th (78125th) parent.

This allows for fast lookup of a block. To find information about a transaction in block 105404, we can start with the current head, say at 106034. Instead of iterating 630 times from child to parent, we can jump via index 4 (625 blocks back) to block 105409, then via index 1 (5 blocks back), and wind up at block 105404 after only two jumps.
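The lookup can be sketched in Python. This is only an illustration under assumptions: the helper name `jumps_to` and the greedy strategy (always take the largest jump that does not overshoot) are mine and are not taken from the BTCRelay source.

```python
# Ancestor index i points 5**i blocks back (i = 0..7), as described above.
ANCESTOR_DEPTHS = [5 ** i for i in range(8)]


def jumps_to(head_height, target_height):
    """Return the (ancestor index, resulting height) hops from head to target.

    Hypothetical helper: greedily takes the largest hop that does not overshoot.
    """
    if target_height > head_height:
        raise ValueError("target is above the head")
    hops = []
    height = head_height
    while height > target_height:
        remaining = height - target_height
        # Largest ancestor index whose depth still fits in the remaining distance.
        index = max(i for i, depth in enumerate(ANCESTOR_DEPTHS) if depth <= remaining)
        height -= ANCESTOR_DEPTHS[index]
        hops.append((index, height))
    return hops


print(jumps_to(106034, 105404))
```

For the heights used in the example, this yields the hops (4, 105409) and (1, 105404).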

A total difficulty, or score, is also associated with each stored block. It is calculated as scorePrevBlock + difficulty, meaning that every block stored in BTCRelay carries its own accumulated difficulty: the sum of the difficulty of all blocks since genesis.

This can be used to easily determine which chain is the canonical chain: the block with the highest score (accumulated difficulty) is the blockchain head block.
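A toy Python model of the scoring rule might look as follows. The class `HeaderStore` and its method names are hypothetical, and the real contract also checks proof-of-work and difficulty, which this sketch leaves out.

```python
class HeaderStore:
    """Toy model: headers keyed by hash, each with an accumulated score."""

    def __init__(self, genesis_hash, genesis_difficulty):
        self.score = {genesis_hash: genesis_difficulty}
        self.head = genesis_hash

    def store(self, block_hash, parent_hash, difficulty):
        if parent_hash not in self.score:
            return False  # unknown parent: reject the header
        # score = scorePrevBlock + difficulty
        self.score[block_hash] = self.score[parent_hash] + difficulty
        # The block with the highest accumulated score becomes the head.
        if self.score[block_hash] > self.score[self.head]:
            self.head = block_hash
        return True


store = HeaderStore("genesis", 1)
store.store("a", "genesis", 10)   # short fork with high difficulty
store.store("b", "genesis", 4)    # longer fork with low difficulty
store.store("c", "b", 4)
print(store.head)
```

Here the head stays at "a": the competing fork is longer, but its accumulated difficulty (9) is lower than that of "a" (11).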

Relaying

I started by looking into the code for relaying transactions. The relay functionality can be used in the following scenario:

  1. Alice and Bob have agreed to use BTCSwap, Alice will buy Ether from Bob. Bob sends his Ether to the BTCSwap service for escrow.
  2. Alice pays Bob in bitcoin over the Bitcoin blockchain. She wants BTCSwap to make note of this, to release the escrow.
  3. Alice calls btcrelay.relayTx() with transaction information and address of BTCSwap contract. BTCRelay verifies the transaction, and invokes the processTransaction method of the BTCSwap contract.
  4. BTCSwap verifies that the caller is the trusted BTCRelay contract instance, and releases the escrow.

The call sequence when BTCRelay is invoked in step 3 is a bit like this:

[Figure: BTCSwap call sequence]

The relay functionality implementation:

def relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract):
    txHash = self.verifyTx(txBytes, txIndex, sibling, txBlockHash, value=msg.value)
    if txHash != 0:
        returnCode = contract.processTransaction(txBytes, txHash)
        log(type=RelayTransaction, txHash, returnCode)
        return(returnCode)

    log(type=RelayTransaction, 0, ERR_RELAY_VERIFY)
    return(ERR_RELAY_VERIFY)

One interesting thing is that if the return value from contract.processTransaction(txBytes, txHash) is ERR_RELAY_VERIFY, it is extremely difficult for a caller to distinguish that event from a real ERR_RELAY_VERIFY event.

I don’t know if this is really a vulnerability, or even a bug. In some obscure scenario, this may be bad, but I was hoping for something more juicy.
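The ambiguity can be reproduced with a few lines of Python that mimic the control flow of relayTx. The numeric value of ERR_RELAY_VERIFY below is a placeholder chosen for the sketch, and `relay_tx` is a hypothetical stand-in, not the contract itself.

```python
ERR_RELAY_VERIFY = 30010  # placeholder value, assumed for this sketch


def relay_tx(verified, process_transaction):
    """Mimic relayTx: forward the callee's return code if verification passed."""
    if verified:
        return process_transaction()
    return ERR_RELAY_VERIFY


# Case 1: verification really failed.
failed_verification = relay_tx(False, lambda: 1)
# Case 2: verification passed, but the target contract returns the same code.
odd_target = relay_tx(True, lambda: ERR_RELAY_VERIFY)

print(failed_verification == odd_target)
```

From the return code alone, the two cases look identical to the caller.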

Looking into the way ancestors are stored, in btcChain.se, I found something interesting:

macro m_saveAncestors($blockHashArg, $hashPrevBlockArg):
    with $blockHash = $blockHashArg:
        with $hashPrevBlock = $blockHashArg:
            self.internalBlock[self.ibIndex] = blockHash
            m_setIbIndex(blockHash, self.ibIndex)
            self.ibIndex += 1

            m_setHeight(blockHash, m_getHeight(hashPrevBlock) + 1)

Do you see it? Let’s zoom in:

macro m_saveAncestors($BLOCK_HASH_ARG, $hashPrevBlockArg):
    with $blockHash = $BLOCK_HASH_ARG:
        with $hashPrevBlock = $BLOCK_HASH_ARG:

The code completely ignores the parameter hashPrevBlockArg, instead using blockHashArg as both current hash and parent hash. This was an outright error, bound to result in weird behaviour.

However, it was more in the category of functional error, not the kind of proper exploitable vulnerability that I was looking for.

BTCRelay consists of 5 files. I had now covered two of them: btcChain.se and btcrelay.se. Two more, constants.se and btcBulkStoreHeaders.se, did not contain much code at all. With only incentive.se left, I was losing hope of finding any proper security vulnerabilities.

Incentives

Remember I said above that a relayer submits Bitcoin block headers to the service. Now, why would someone voluntarily feed Bitcoin data into Ethereum? There’s a non-trivial cost involved, incurred both for the computation performed and for the increased blockchain data storage.

This has been solved by incentives. Whenever a person submits a block header using the method storeBlockWithFee, two additional pieces of data are associated with the block: a fee and a feeRecipient.

Whenever someone wants to have a transaction from block X verified, the caller needs to pay the fee associated with the block header. Thus, the relayer gets paid every time one of his blocks is used.

To discourage a relayer from setting too high a fee, it’s possible for someone else to buy off the original relayer. When doing so, the new “block header owner” pays the changeRecipientFee, a fee which is calculated to correspond to 2 times the gas costs of header submission. A condition for being allowed to take over a header in this fashion is that the new owner needs to set a lower fee than the previous owner.

Thus, incentives serve to ensure that block headers are submitted and that fees adapt to the market. It’s quite elegant.
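The takeover rule can be modelled in a few lines of Python. This is a sketch of the rules as described here, not a translation of incentive.se: the class `FeeBook`, its method names, and the simplification that the payment goes straight to the previous owner are assumptions of the sketch.

```python
class FeeBook:
    """Toy model of per-header fee info and the buy-out rule."""

    def __init__(self, change_recipient_fee):
        self.change_recipient_fee = change_recipient_fee
        self.info = {}  # block hash -> (fee, recipient)

    def store_with_fee(self, block_hash, fee, recipient):
        self.info[block_hash] = (fee, recipient)

    def change_fee_recipient(self, block_hash, new_fee, new_recipient, paid):
        old_fee, old_recipient = self.info[block_hash]
        # The new owner must undercut the old fee and pay the change fee.
        if new_fee >= old_fee or paid < self.change_recipient_fee:
            return None  # takeover refused
        self.info[block_hash] = (new_fee, new_recipient)
        return old_recipient  # assumed: the previous owner receives the payment


book = FeeBook(change_recipient_fee=100)
book.store_with_fee("header", 50, "greedy")
print(book.change_fee_recipient("header", 60, "other", 100))  # higher fee: refused
print(book.change_fee_recipient("header", 40, "fair", 100))   # lower fee: accepted
```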

Vulnerability

And here’s where I found a real vulnerability:

# if sufficient fee for 'txBlockHash' is provided, pay the feeRecipient
# and return 1.  otherwise return 0.
# This does NOT return any funds to incorrect callers
def feePaid(txBlockHash, amountWei):
    if msg.value >= amountWei:
        if msg.value > 0:
            feeRecipient = m_getFeeRecipient(txBlockHash)
            if !send(feeRecipient, msg.value):
                invalid()
            log(type=EthPayment, feeRecipient, msg.value)
        return(1)
    return(0) 

This code pays the fee to the feeRecipient, returning false if insufficient funds have been provided with the call, and true if payment was successful. However, there is also a third exit state: the invalid() operation.

If the send is not successful, a VM error is generated, causing a roll-back of the call. How can send fail?

There are, afaik, only two possibilities:

  1. The caller wishes it to fail, by depleting the call stack prior to the call - the call stack attack
  2. The recipient wishes it to fail.

When I wrote about the call-stack attack previously, I found a little fun quirk with contracts. I didn’t write about it at the time, since it seemed well-known - it’s documented in the Solidity documentation:

// This contract rejects any Ether sent to it. It is good
// practise to include such a function for every contract
// in order not to loose Ether.
contract Rejector {
    function() { throw; }
}

One thing that probably most developers are unaware of is that there is no distinction between sending ether and invoking a contract.

In Ethereum, both operations are calls; the former with value associated, and the latter with data containing information about which method to invoke, and parameters. Thus, the operation send(feeRecipient, msg.value) would invoke the default method of the malicious recipient, triggering a throw which causes send to return false, triggering invalid().
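The three exit states of feePaid, and the effect of a rejecting recipient, can be modelled in Python. The classes and the exception below are stand-ins invented for this sketch: `VMError` plays the role of the rollback caused by invalid(), and `receive` returning False plays the role of a send() that reports failure.

```python
class VMError(Exception):
    """Stands in for the rollback caused by invalid()."""


class Account:
    def receive(self, value):
        return True   # a regular account accepts any transfer


class Rejector(Account):
    def receive(self, value):
        return False  # models `function() { throw; }`: send() reports failure


def fee_paid(recipient, required_fee, value):
    """Same three exits as feePaid: 0, 1, or a VM error."""
    if value >= required_fee:
        if value > 0:
            if not recipient.receive(value):
                raise VMError("send failed, whole call rolled back")
        return 1
    return 0


print(fee_paid(Account(), 5, 5))   # fee paid
print(fee_paid(Account(), 5, 4))   # insufficient fee
try:
    fee_paid(Rejector(), 5, 5)
except VMError as err:
    print("rolled back:", err)
```

With a Rejector as recipient, no amount of ether sent by the caller can make the function return 1.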

Impact

The feePaid function is used in two cases:

  1. When buying out a header-owner. This means that an owner could set a very high price to use a header, but it would be impossible to buy it and set a lower price. Since the attacker would not actually benefit from this (he can’t receive the fee - remember?), it would basically only be denial-of-service by clogging the service with expensive headers.
  2. When verifying a transaction. verifyTx calls helperVerifyHash__ which calls feePaid. This is very interesting! This means that none of the transactions within the header would be verifiable!

To understand the impact of this, let’s recap BTCSwap again, but this time let Bob be malicious and use this bug:

  1. Alice and Bob have agreed to use BTCSwap, Alice will buy Ether from Bob. Bob sends his Ether to the BTCSwap service for escrow.
  2. Alice pays Bob in bitcoin over the Bitcoin blockchain. She wants BTCSwap to make note of this, to release the escrow.
  2b. Bob relays the Bitcoin header to BTCRelay, setting feeRecipient to a Rejector-instance on the blockchain.
  3. Alice calls btcrelay.relayTx() with transaction information and address of BTCSwap contract. BTCRelay cannot verify the transaction, since paying the fee to the Rejector always fails.
  4. After enough time has passed, BTCSwap releases the escrow back to Bob, now holding both his Ether and the Bitcoin that Alice sent.

Suggested remediation

Ensure that only regular accounts can be used as feeRecipient:

  • Use tx.origin instead of msg.sender in storeBlockWithFee.
  • Remove storeBlockWithFeeAndRecipient.
  • Modify changeFeeRecipient to use tx.origin instead of the argument.

One more bug

I also found another minor bug in the incentive code: an error in the fee setting can cause an erroneous amount and feeRecipient address. If a user specifies a fee value larger than ffffffffffffffffffffffff wei, the amount “spills” into the address of feeRecipient while the fee is truncated.

The method storeBlockWithFeeAndRecipient accepts feeWei and passes it to m_setFeeInfo without masking.

def storeBlockWithFeeAndRecipient(blockHeaderBytes:str, feeWei, feeRecipient):
    beginGas = msg.gas
    res = self.storeBlockHeader(blockHeaderBytes)
    if res:
        blockHash = m_dblShaFlip(blockHeaderBytes)
        m_setFeeInfo(blockHash, feeWei, feeRecipient)

This causes the _feeInfo to be set, using bitwise OR, implicitly assuming that feeWei fits within 12 bytes:

macro m_setFeeInfo($blockHash, $feeWei, $feeRecipient):
    self.block[$blockHash]._feeInfo = ($feeRecipient * BYTES_12) | $feeWei

An example of values which should reproduce the bug:

address   :       000000000000000000000000aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
fee (13b) :       0000000000000000000000000000000000000001000000000000000000000000

address <<12 :    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa000000000000000000000000
(add<<12)|fee:    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab000000000000000000000000 


getFeeAmount:      0000000000000000000000000000000000000000000000000000000000000000
m_getFeeRecipient:                      0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab

The consequence would be that a user who specifies too high a feeWei would lose the fee, since the money is sent to a bogus address. I don’t know if this is exploitable, except maybe through some contrived scenario where a user can affect the fee, but not the address, and uses the fee to change the address to his own address.
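The spill-over is easy to reproduce with Python integers. `set_fee_info` uses the same expression as m_setFeeInfo; the two getters are my assumption of how the packed value is unpacked (low 12 bytes for the fee, the rest for the recipient).

```python
BYTES_12 = 2 ** 96          # shifting left by 12 bytes == multiplying by 2**96
FEE_MASK = BYTES_12 - 1     # the low 12 bytes hold the fee


def set_fee_info(fee_wei, fee_recipient):
    # Same expression as m_setFeeInfo: fee_wei is not masked.
    return (fee_recipient * BYTES_12) | fee_wei


def get_fee_amount(fee_info):
    return fee_info & FEE_MASK          # assumed unpacking of the fee


def get_fee_recipient(fee_info):
    return fee_info // BYTES_12         # assumed unpacking of the recipient


address = int("aa" * 20, 16)
fee = 2 ** 96                           # one wei more than fits in 12 bytes
info = set_fee_info(fee, address)

print(hex(get_fee_amount(info)))
print(hex(get_fee_recipient(info)))
```

The fee reads back as zero, and the recipient address has its lowest bit flipped, so it ends in ...ab instead of ...aa.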

Feeless Leeching

While the bugs above have been pretty low-level technical glitches, another issue was found at the ‘architectural’ level.

One thing to note for Serpent developers is that there is no such thing as private methods. All methods are public; the only things akin to private methods are macros, which really are just code that is expanded at compile-time.

This is important, and easy to forget if you are more used to Solidity. Besides what’s actually listed in the API documentation, all these functions are callable externally:

#grep def *.se
btcBulkStoreHeaders.se:def bulkStoreHeader(headersBytes:str, count):
btcChain.se:def inMainChain(txBlockHash):
btcChain.se:def getBlockHash(blockHeight):
btcChain.se:def fastGetBlockHash(blockHeight):
btcrelay.se:def init():
btcrelay.se:def setInitialParent(blockHash, height, chainWork):
btcrelay.se:def storeBlockHeader(blockHeaderBytes:str):
btcrelay.se:def verifyTx(txBytes:str, txIndex, sibling:arr, txBlockHash):
btcrelay.se:def helperVerifyHash__(txHash:uint256, txIndex, sibling:arr, txBlockHash):
btcrelay.se:def relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract):
btcrelay.se:def getBlockchainHead():
btcrelay.se:def getLastBlockHeight():
btcrelay.se:def getChainWork():
btcrelay.se:def getAverageChainWork():
btcrelay.se:def computeMerkle(txHash, txIndex, sibling:arr):
btcrelay.se:def within6Confirms(txBlockHash):
btcrelay.se:def getBlockHeader(blockHash):
incentive.se:def storeBlockWithFee(blockHeaderBytes:str, feeWei):
incentive.se:def storeBlockWithFeeAndRecipient(blockHeaderBytes:str, feeWei, feeRecipient):
incentive.se:def feePaid(txBlockHash, amountWei):
incentive.se:def changeFeeRecipient(blockHash, feeWei, feeRecipient):
incentive.se:def getFeeRecipient(blockHash):
incentive.se:def getFeeAmount(blockHash):
incentive.se:def getChangeRecipientFee():
incentive.se:def depthCheck(n):

Three of these trigger the fee processing:

  1. relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract) -> helperVerifyHash__->feePaid
  2. getBlockHeader -> feePaid
  3. changeFeeRecipient-> feePaid.

Would it be possible to stitch together the same functionality as BTCRelay offers, but without paying fees? It turned out that the answer was yes.

The actual verification looks like this:

def helperVerifyHash__(txHash:uint256, txIndex, sibling:arr, txBlockHash):
    if !self.feePaid(txBlockHash, m_getFeeAmount(txBlockHash), value=msg.value):  # in incentive.se
        log(type=VerifyTransaction, txHash, ERR_BAD_FEE)
        return(ERR_BAD_FEE)

    if self.within6Confirms(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CONFIRMATIONS)
        return(ERR_CONFIRMATIONS)

    if !self.inMainChain(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CHAIN)
        return(ERR_CHAIN)

    merkle = self.computeMerkle(txHash, txIndex, sibling)
    realMerkleRoot = getMerkleRoot(txBlockHash)

    if merkle == realMerkleRoot:
        log(type=VerifyTransaction, txHash, 1)
        return(1)
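The Merkle check at the end follows the usual Bitcoin SPV scheme: hash the transaction together with each sibling on the path up to the root, using txIndex to decide on which side the sibling goes. A minimal Python sketch of that idea is shown below. It is not a translation of BTCRelay's computeMerkle; the function names are mine, and byte-order flipping between Bitcoin's display format and internal format is ignored.

```python
import hashlib


def dbl_sha(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compute_merkle(tx_hash: bytes, tx_index: int, siblings: list) -> bytes:
    """Walk from a leaf to the root using the sibling hashes on the path."""
    node = tx_hash
    for sibling in siblings:
        if tx_index % 2 == 0:
            node = dbl_sha(node + sibling)   # we are the left child
        else:
            node = dbl_sha(sibling + node)   # we are the right child
        tx_index //= 2
    return node


# Tiny four-leaf tree to exercise the proof for leaf 2.
leaves = [dbl_sha(bytes([i])) for i in range(4)]
left = dbl_sha(leaves[0] + leaves[1])
right = dbl_sha(leaves[2] + leaves[3])
root = dbl_sha(left + right)

print(compute_merkle(leaves[2], 2, [leaves[3], left]) == root)
```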

We can easily create a corresponding malicious function which calls the internal parts of BTCRelay:

def myVerifyHash(txBytes:str, txIndex, sibling:arr, txBlockHash):

    txHash = m_dblShaFlip(txBytes)
    if relay.within6Confirms(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CONFIRMATIONS)
        return(ERR_CONFIRMATIONS)

    if !relay.inMainChain(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CHAIN)
        return(ERR_CHAIN)

    merkle = relay.computeMerkle(txHash, txIndex, sibling)

The call to getMerkleRoot however, is a macro, meaning that it’s not available for calls.

# get the merkle root of '$blockHash'
macro getMerkleRoot($blockHash):
    with $addr = ref(self.block[$blockHash]._blockHeader[0]):
        flip32Bytes(sload($addr+1) * BYTES_4 + div(sload($addr+2), BYTES_28))  # must use div()

Even worse, we cannot implement it locally, since it loads the full block header, keyed by the block hash. As we saw above, getBlockHeader is fee-protected.

There’s a workaround, however, if we switch our method signature and take the 80 byte blockHeader instead of the blockHash. The Dapp can fetch the block header from some API, no biggie.
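To see why the 80-byte header is enough, here is a Python sketch that derives both the block hash and the Merkle root from raw header bytes. The function names are mine; `dbl_sha_flip` does what the name m_dblShaFlip suggests (double SHA-256, then byte reversal), which is an assumption about the macro. The Bitcoin genesis header is used as test input.

```python
import hashlib


def dbl_sha_flip(data: bytes) -> str:
    """Double SHA-256, byte-reversed, as hex (Bitcoin's displayed hash format)."""
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return digest[::-1].hex()


def merkle_root_from_header(header: bytes) -> str:
    """Read the Merkle root straight out of an 80-byte block header."""
    if len(header) != 80:
        raise ValueError("a Bitcoin block header is exactly 80 bytes")
    # Layout: version (4) | previous hash (32) | merkle root (32) | time, bits, nonce (4 each)
    return header[36:68][::-1].hex()


# Bitcoin genesis block header, assembled field by field.
GENESIS_HEADER = bytes.fromhex(
    "01000000"
    + "00" * 32
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"
    + "29ab5f49" + "ffff001d" + "1dac2b7c"
)

print(dbl_sha_flip(GENESIS_HEADER))
print(merkle_root_from_header(GENESIS_HEADER))
```

The block hash computed this way is what gets passed to within6Confirms and inMainChain, which ties the caller-supplied header to a header that BTCRelay has actually stored.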

The full leecher implementation could look something like this:

def leechVerifyHash(txBytes:str, txIndex, sibling:arr, txBlock:str):

    txBlockHash = m_dblShaFlip(txBlock)
    txHash = m_dblShaFlip(txBytes)

    # getMerkleRoot is here a local helper which reads the
    # merkle root directly from the 80 byte header in txBlock
    if  !relay.within6Confirms(txBlockHash) &&
        relay.inMainChain(txBlockHash) &&
        getMerkleRoot(txBlock) == relay.computeMerkle(txHash, txIndex, sibling):

            # Yay, verified
            # ...
    else:
        # ...

So, using this mechanism, we can obtain the following flow instead:

[Figure: leech call flow]

Conclusion

In my audit of BTCRelay, I found five bugs. Three of them I’d classify as functional flaws:

  • Ambiguous return value
  • Error in block storage/lookup
  • Fee input affects recipient address

Whereas two were more clear-cut security vulnerabilities:

  • Ability to deny header validation through Denial-Of-Payment
  • Missing protection against service-leeching

The Ethereum bounty page is a bit in flux, and has not yet been updated to reflect the new score from these findings.

The BTCRelay service is quite complex: it implements Bitcoin SPV functionality along with an incentive scheme, and has to deal with several kinds of (potentially malicious) actors. I have high regard for the authors of BTC Relay, and think it will be a great tool.

Security in Ethereum contract programming is still pretty uncharted waters, and there’s a lot of groundwork to be done to arrive at best practices. And best practices only go so far - in fact, one of the bugs existed only because existing best practices dictated a throw on failed send().

2016-03-14
