BTC Relay is a very interesting Ethereum project. It is an Ethereum contract that stores Bitcoin block headers. BTC Relay uses these block headers to build a mini-version of the Bitcoin blockchain, and can thus be used by Ethereum contracts to verify Bitcoin transactions.
Recently, it was announced that BTC Relay was included in the Ethereum Bug Bounty program. I’ve been interested in looking under the hood of BTC Relay since Joseph Chow’s presentation at Devcon 1.
Previously, I have studied the various VM-implementations, and this was my first real in-depth study of an Ethereum service on an application level, as opposed to infrastructure. So naturally, I gave it a go.
There are a few different steps involved in BTC relay.
A relayer submits a Bitcoin block header to the service (see btcrelay.se for one method to submit headers - I’ll mention other ways later). The bitcoin-style header-hash is calculated (m_dblShaFlip), the presence of a parent-block is verified, and difficulty is checked.
No other validation is made on the validity of the actual transactions! This may seem weird at first glance, but that is how SPV clients work. The block is added to the list of blocks, with references toward its ancestors.
Internally, blocks are stored in a dictionary-like construct, where they are indexed by their hash. However, for the typical use case, we want to look up a certain block number on the main chain, and verify the presence of a particular transaction there.
To enable that, each block also contains references to ancestors, more or less like a linked list - but with a twist. Each stored block has 8 references; just like a linked list it has a reference to the immediate parent. But it also has references to the 5th parent, the 25th parent, etc. up to the 5^7th.
This allows for fast lookup of a block; in order to find information about a transaction in block 105404, we can start with the current head, say at 106034. Instead of iterating 630 times from child to parent, we can instead jump on index 4 (625 blocks back), to block 105409, use index 1 (5 steps) and wind up with block 105404 with only three iterations.
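The lookup described above can be sketched in a few lines of Python. This is only an illustration working on block heights; the function name and the greedy "largest jump that does not overshoot" strategy are assumptions, not the actual BTC Relay code, and the sketch counts only the jumps themselves, which need not match the iteration count of the real lookup loop.

```python
# Sketch of power-of-five ancestor jumps. It models heights only,
# not the storage layout or the exact loop used by BTC Relay.
NUM_ANCESTORS = 8  # references to the 5^0th .. 5^7th parent


def jumps_to(head_height, target_height):
    """Return the (ancestor index, landing height) jumps from head to target."""
    if target_height > head_height:
        raise ValueError("target is above the head")
    path = []
    height = head_height
    while height > target_height:
        # Pick the largest jump that does not overshoot the target.
        for index in reversed(range(NUM_ANCESTORS)):
            if height - 5 ** index >= target_height:
                height -= 5 ** index
                path.append((index, height))
                break
    return path


print(jumps_to(106034, 105404))  # [(4, 105409), (1, 105404)]
```

Since index 0 is always the immediate parent, the loop makes progress on every pass and always terminates.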
A total difficulty, or score, is also associated with each block that is stored. This is calculated as scorePrevBlock+difficulty, meaning that any block stored in BTCRelay has an individual accumulated difficulty: the sum total of the difficulty of all blocks since genesis.
This can be used to easily determine which chain is the canonical chain: the block with the highest score (accumulated difficulty) is the blockchain head block.
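As a toy illustration of that rule, the following Python sketch keeps a score per block and picks the head by highest score. The dictionary layout, the block names and the difficulty numbers are made up for the example and are not taken from BTC Relay.

```python
# Toy model: each stored block gets its parent's score plus its own difficulty.
def store_block(scores, block_hash, parent_hash, difficulty):
    scores[block_hash] = scores[parent_hash] + difficulty
    return scores[block_hash]


def head(scores):
    """The head is the block with the highest accumulated difficulty."""
    return max(scores, key=scores.get)


scores = {"genesis": 0}
store_block(scores, "a1", "genesis", 10)
store_block(scores, "a2", "a1", 10)        # longer chain, score 20
store_block(scores, "b1", "genesis", 25)   # shorter fork, but more work
print(head(scores))  # b1
```

Note that the fork with fewer blocks wins here, because the comparison is on accumulated difficulty and not on chain length.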
I started looking into the code for relaying transactions. The relay functionality can be used in the following scenario;
processTransaction method of the BTCSwap contract.
The call sequence when BTCRelay is invoked in step 3 is a bit like this:
The relay functionality implementation:
def relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract):
    txHash = self.verifyTx(txBytes, txIndex, sibling, txBlockHash, value=msg.value)
    if txHash != 0:
        returnCode = contract.processTransaction(txBytes, txHash)
        log(type=RelayTransaction, txHash, returnCode)
        return(returnCode)
    log(type=RelayTransaction, 0, ERR_RELAY_VERIFY)
    return(ERR_RELAY_VERIFY)
One interesting thing is that if the return value from contract.processTransaction(txBytes, txHash) is ERR_RELAY_VERIFY, it is extremely difficult for a caller to distinguish that event from a real ERR_RELAY_VERIFY event.
I don’t know if this is really a vulnerability, or even a bug. In some obscure scenario, this may be bad, but I was hoping for something more juicy.
Looking into the way ancestors are stored, in btcChain.se, I found something interesting:
macro m_saveAncestors($blockHashArg, $hashPrevBlockArg):
    with $blockHash = $blockHashArg:
        with $hashPrevBlock = $blockHashArg:
            self.internalBlock[self.ibIndex] = blockHash
            m_setIbIndex(blockHash, self.ibIndex)
            self.ibIndex += 1
            m_setHeight(blockHash, m_getHeight(hashPrevBlock) + 1)
Do you see it? Let’s zoom in:
macro m_saveAncestors($BLOCK_HASH_ARG, $hashPrevBlockArg):
    with $blockHash = $BLOCK_HASH_ARG:
        with $hashPrevBlock = $BLOCK_HASH_ARG:
The code completely ignores the parameter hashPrevBlockArg, instead using blockHashArg as both current hash and parent hash. This was an outright error, bound to result in weird behaviour.
However, it was more in the category of functional error, not the kind of proper exploitable vulnerability that I was looking for.
BTCRelay consists of 5 files. I had now covered two of them: btcChain.se and btcrelay.se. Two others, constants.se and btcBulkStoreHeaders.se, did not contain much code at all. With only incentive.se left, I was losing hope of finding any proper security vulnerabilities.
Remember I said above that a “relayer submits a Bitcoin block header to the service”? Now, why would someone voluntarily feed bitcoin-data into Ethereum? There’s a non-trivial cost involved, incurred both for the computation performed and the increased blockchain data storage.
This has been solved by incentives. Whenever a person submits a block header, if he uses the method storeBlockWithFee, two additional pieces of data are associated with the block; a fee and a feeRecipient.
Whenever someone wants to have a transaction from block X verified, the caller needs to pay the fee associated with the block header. Thus, the relayer gets paid every time one of his blocks is used.
To discourage a relayer from setting too high a fee, it’s possible for someone else to buy off the original relayer. When doing so, the new “block header owner” pays the changeRecipientFee, a fee which is calculated to correspond to 2 times the gas costs of header submission. A condition for being allowed to take over a header in this fashion is that the new owner needs to set a lower fee than the previous owner.
Thus, incentives serve to ensure that block headers are submitted and that fees adapt to the market. It’s quite elegant.
And here’s where I found a real vulnerability:
# if sufficient fee for 'txBlockHash' is provided, pay the feeRecipient
# and return 1. otherwise return 0.
# This does NOT return any funds to incorrect callers
def feePaid(txBlockHash, amountWei):
    if msg.value >= amountWei:
        if msg.value > 0:
            feeRecipient = m_getFeeRecipient(txBlockHash)
            if !send(feeRecipient, msg.value):
                invalid()
            log(type=EthPayment, feeRecipient, msg.value)
        return(1)
    return(0)
This code pays the fee to the feeRecipient, returning false if insufficient funds have been provided with the call, and true if payment was successful. However, there is also a third exit-state; the invalid() operation.
If the send is not successful, a VM-error is generated, causing a roll-back of the call. How can send fail?
There are, afaik, only two possibilities:
When I wrote about the call-stack attack previously, I found a little fun quirk with contracts. I didn’t write about it at the time, since it seemed well-known - it’s documented in the Solidity documentation:
// This contract rejects any Ether sent to it. It is good
// practise to include such a function for every contract
// in order not to loose Ether.
contract Rejector {
    function() { throw; }
}
One thing that probably even most developers are unaware of, is that there is no distinction between sending ether and invoking a contract.
In Ethereum, both operations are calls; the former with value associated, and the latter with data containing information about which method to invoke, and parameters. Thus, the operation send(feeRecipient, msg.value) would invoke the default method of the malicious recipient, triggering a throw which causes send to return false, triggering invalid().
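The three exit states can be modelled in plain Python to make the failure mode concrete. This is a rough simulation, not EVM semantics: the Account and Rejector classes, the VMError exception and the send helper are all stand-ins invented for the sketch.

```python
class VMError(Exception):
    """Stands in for the roll-back triggered by invalid()."""


class Account:
    """A regular account: accepts any value sent to it."""
    def receive(self, value):
        return True


class Rejector(Account):
    """Models a contract whose default function throws."""
    def receive(self, value):
        return False


def send(recipient, value):
    # In this model a throwing recipient simply makes send return False.
    return recipient.receive(value)


def fee_paid(fee_recipient, amount_wei, msg_value):
    """Mirror of the feePaid logic: 1 = paid, 0 = too little, VMError = rolled back."""
    if msg_value >= amount_wei:
        if msg_value > 0:
            if not send(fee_recipient, msg_value):
                raise VMError("send failed, call rolled back")
        return 1
    return 0


print(fee_paid(Account(), 5, 5))   # 1
print(fee_paid(Account(), 5, 1))   # 0
try:
    fee_paid(Rejector(), 5, 5)
except VMError as err:
    print("rolled back:", err)
```

With a rejecting recipient, no amount of ether makes the call succeed, which is what turns a failed payment into a denial of service.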
The feePaid function is used in two cases;
fee - remember?), it would basically be only denial-of-service by clogging the service with expensive headers.
verifyTx calls helperVerifyHash__ which calls feePaid. This is very interesting! This means that none of the transactions within the header would be verifiable!
To understand the impact of this, let’s recap BTCSwap again, but this time let Bob be malicious and use this bug:
feeRecipient to a Rejector-instance on the blockchain.
Ensure that only regular accounts can be used as feeRecipient:
msg.origin instead of msg.sender in storeBlockWithFee.
storeBlockWithFeeAndRecipient
changeFeeRecipient to use msg.origin instead of argument
I also found another minor bug in the incentive-code, namely that an error in the fee setting can cause an erroneous amount and feeRecipient address. If a user specifies a fee value larger than ffffffffffffffffffffffff wei, the amount “spills” into the address of feeRecipient while truncating the fee.
The method storeBlockWithFeeAndRecipient accepts feeWei and passes it to m_setFeeInfo without masking.
def storeBlockWithFeeAndRecipient(blockHeaderBytes:str, feeWei, feeRecipient):
    beginGas = msg.gas
    res = self.storeBlockHeader(blockHeaderBytes)
    if res:
        blockHash = m_dblShaFlip(blockHeaderBytes)
        m_setFeeInfo(blockHash, feeWei, feeRecipient)
This causes the _feeInfo to be set, using bitwise OR, implicitly assuming that feeWei fits within 12 bytes:
macro m_setFeeInfo($blockHash, $feeWei, $feeRecipient):
    self.block[$blockHash]._feeInfo = ($feeRecipient * BYTES_12) | $feeWei
An example of values which should reproduce the bug:
address : 000000000000000000000000aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
fee (13b) : 0000000000000000000000000000000000000001000000000000000000000000
address <<12 : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa000000000000000000000000
(add<<12)|fee: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab000000000000000000000000
getFeeAmount: 0000000000000000000000000000000000000000000000000000000000000000
m_getFeeRecipient: 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab
The consequence would be that a user who specifies a too high feeWei would lose the fee, since the money is sent to a bogus address. I don’t know if this is exploitable, except maybe through some contrived scenario where a user can affect the fee, but not the address, and uses the fee to change the address to his own address.
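The spill can be reproduced with ordinary integer arithmetic. The Python sketch below mirrors the packing in m_setFeeInfo; the two getters and the checked variant at the end are assumptions made for illustration (the getters simply split the word at 12 bytes, and the guard is one possible fix, not the one adopted by the project).

```python
BYTES_12 = 2 ** 96  # multiplying by this shifts a value left by 12 bytes


def set_fee_info(fee_wei, fee_recipient):
    """Same packing as m_setFeeInfo: recipient in the high bytes, fee in the low 12."""
    return (fee_recipient * BYTES_12) | fee_wei


def get_fee_amount(fee_info):
    return fee_info % BYTES_12      # low 12 bytes (assumed getter)


def get_fee_recipient(fee_info):
    return fee_info // BYTES_12     # everything above the low 12 bytes (assumed getter)


def set_fee_info_checked(fee_wei, fee_recipient):
    """Hypothetical guard: refuse fees that do not fit in 12 bytes."""
    if not 0 <= fee_wei < BYTES_12:
        raise ValueError("feeWei does not fit in 12 bytes")
    return set_fee_info(fee_wei, fee_recipient)


address = int("aa" * 20, 16)
fee = BYTES_12                      # smallest 13-byte fee
info = set_fee_info(fee, address)

print(get_fee_amount(info))                  # 0
print(hex(get_fee_recipient(info)))          # the address, but ending in ...ab
print(get_fee_recipient(info) == address)    # False
```

The stored fee reads back as zero while the recipient has silently changed, matching the table above.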
While the bugs above have been pretty low-level technical glitches, another issue was found at the ‘architectural’ level.
One thing to note for Serpent developers is that there is no such thing as private methods. All methods are public; the only things akin to private methods are macros, which really are just code that is expanded at compile-time.
This is important, and easy to forget if you are more used to Solidity. Besides what’s actually listed in the API-documentation, all these functions are callable externally:
#grep def *.se
btcBulkStoreHeaders.se:def bulkStoreHeader(headersBytes:str, count):
btcChain.se:def inMainChain(txBlockHash):
btcChain.se:def getBlockHash(blockHeight):
btcChain.se:def fastGetBlockHash(blockHeight):
btcrelay.se:def init():
btcrelay.se:def setInitialParent(blockHash, height, chainWork):
btcrelay.se:def storeBlockHeader(blockHeaderBytes:str):
btcrelay.se:def verifyTx(txBytes:str, txIndex, sibling:arr, txBlockHash):
btcrelay.se:def helperVerifyHash__(txHash:uint256, txIndex, sibling:arr, txBlockHash):
btcrelay.se:def relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract):
btcrelay.se:def getBlockchainHead():
btcrelay.se:def getLastBlockHeight():
btcrelay.se:def getChainWork():
btcrelay.se:def getAverageChainWork():
btcrelay.se:def computeMerkle(txHash, txIndex, sibling:arr):
btcrelay.se:def within6Confirms(txBlockHash):
btcrelay.se:def getBlockHeader(blockHash):
incentive.se:def storeBlockWithFee(blockHeaderBytes:str, feeWei):
incentive.se:def storeBlockWithFeeAndRecipient(blockHeaderBytes:str, feeWei, feeRecipient):
incentive.se:def feePaid(txBlockHash, amountWei):
incentive.se:def changeFeeRecipient(blockHash, feeWei, feeRecipient):
incentive.se:def getFeeRecipient(blockHash):
incentive.se:def getFeeAmount(blockHash):
incentive.se:def getChangeRecipientFee():
incentive.se:def depthCheck(n):
Three of these trigger the fee-processing;
relayTx(txBytes:str, txIndex, sibling:arr, txBlockHash, contract) -> helperVerifyHash__ -> feePaid
getBlockHeader -> feePaid
changeFeeRecipient -> feePaid
Would it be possible to stitch together the same functionality as BTCRelay offers, but without paying fees? It turned out that the answer was yes.
The actual verification looks like this:
def helperVerifyHash__(txHash:uint256, txIndex, sibling:arr, txBlockHash):
    if !self.feePaid(txBlockHash, m_getFeeAmount(txBlockHash), value=msg.value): # in incentive.se
        log(type=VerifyTransaction, txHash, ERR_BAD_FEE)
        return(ERR_BAD_FEE)
    if self.within6Confirms(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CONFIRMATIONS)
        return(ERR_CONFIRMATIONS)
    if !self.inMainChain(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CHAIN)
        return(ERR_CHAIN)
    merkle = self.computeMerkle(txHash, txIndex, sibling)
    realMerkleRoot = getMerkleRoot(txBlockHash)
    if merkle == realMerkleRoot:
        log(type=VerifyTransaction, txHash, 1)
        return(1)
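The last step, comparing a computed Merkle root against the one in the header, is the heart of SPV verification. A minimal Python sketch of such a proof check follows; it assumes Bitcoin-style double SHA-256 and ignores the byte-order flipping the real computeMerkle has to deal with, and the four-leaf tree is made-up test data.

```python
import hashlib


def dbl_sha(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256 (no byte-order flip in this sketch)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compute_merkle(tx_hash: bytes, tx_index: int, siblings: list) -> bytes:
    """Fold the sibling hashes into tx_hash, bottom-up, to obtain the root."""
    result = tx_hash
    for sibling in siblings:
        if tx_index % 2 == 1:
            result = dbl_sha(sibling + result)   # we are the right child
        else:
            result = dbl_sha(result + sibling)   # we are the left child
        tx_index //= 2
    return result


# Build a small four-leaf tree by hand to have something to verify against.
leaves = [dbl_sha(bytes([i])) for i in range(4)]
n01 = dbl_sha(leaves[0] + leaves[1])
n23 = dbl_sha(leaves[2] + leaves[3])
root = dbl_sha(n01 + n23)

# Proof for leaf 2: its sibling leaf 3, then the neighbouring subtree n01.
print(compute_merkle(leaves[2], 2, [leaves[3], n01]) == root)  # True
```

Nothing in this computation needs contract state other than the root itself, which is why only the source of the Merkle root matters for the fee question discussed next.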
We can easily create a corresponding malicious function which calls the internal parts of BTCRelay:
def myVerifyHash(txBytes:str, txIndex, sibling:arr, txBlockHash):
    txHash = m_dblShaFlip(txBytes)
    if relay.within6Confirms(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CONFIRMATIONS)
        return(ERR_CONFIRMATIONS)
    if !relay.inMainChain(txBlockHash):
        log(type=VerifyTransaction, txHash, ERR_CHAIN)
        return(ERR_CHAIN)
    merkle = relay.computeMerkle(txHash, txIndex, sibling)
The call to getMerkleRoot, however, is a macro, meaning that it’s not available for calls.
# get the merkle root of '$blockHash'
macro getMerkleRoot($blockHash):
    with $addr = ref(self.block[$blockHash]._blockHeader[0]):
        flip32Bytes(sload($addr+1) * BYTES_4 + div(sload($addr+2), BYTES_28)) # must use div()
Even worse, we cannot implement it locally since it loads the full block header, keyed from the block hash. As we saw above, getBlockHeader is fee-protected.
There’s a workaround, however, if we switch our method signature and take the 80 byte blockHeader instead of the blockHash. The Dapp can fetch the block header from some API, no biggie.
The full leecher implementation could look something like this:
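def leechVerifyHash(txBytes:str, txIndex, sibling:arr, txBlock:str):
    # txBlock is the raw 80 byte block header supplied by the caller
    txBlockHash = m_dblShaFlip(txBlock)
    txHash = m_dblShaFlip(txBytes)
    # getMerkleRoot is assumed here to be a local variant that reads
    # the merkle root straight from the supplied header bytes
    if !relay.within6Confirms(txBlockHash) &&
       relay.inMainChain(txBlockHash) &&
       getMerkleRoot(txBlock) == relay.computeMerkle(txHash, txIndex, sibling):
        # Yay, verified
        # ...
    else:
        # ...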
So, using this mechanism, we can obtain the following flow instead:
In my audit of BTCRelay, I found five bugs. Three of them I’d classify as functional flaws;
Whereas two were more clear-cut security vulnerabilities:
The Ethereum bounty page is a bit in flux, and has not yet been updated to reflect the new score from these findings.
The BTCRelay service is quite complex, implementing both Bitcoin SPV functionality and an incentive scheme, with several kinds of (potentially malicious) actors. I have high regard for the authors of BTC Relay, and think it will be a great tool.
Security in Ethereum contract programming is still pretty uncharted waters, and there’s a lot of groundwork to be done to arrive at best practices. And best practices only go so far - in fact, one of the bugs existed only because existing best practices dictated to throw at a failed send().
2016-03-14