Ethereum

Lately, I’ve been diving deep into Ethereum. Ethereum is a pretty cool new platform; the idea is to create a decentralized virtual machine. Anyone who wants to use this machine pays for doing so with ‘ether’, and can store data within the machine, run calculations and install ‘contracts’ - basically install functions that can be invoked by other users.

Some describe it as a generalization of Bitcoin; it’s often called Blockchain 2.0 technology, where instead of just a ‘dead’ ledger, we have a Turing-complete set of operations that can be performed: the ledger is a computer.

Anyway, see links below if you want to read up more about it. Since they are running a testnet on which they also have bug-bounties, I focused on the VM to see if I could find something interesting while also getting to know the platform better.

They’ve been quite ambitious, having two different implementations from the get-go: one in Go and one in C++. That’s naturally a goldmine for a bug hunter; in order to create a blockchain split, it would be enough to find some operation where the two VM implementations differ.

The Ethereum VM

The Ethereum VM has a number of opcodes, described in the so-called Yellow Paper, which serves as the basis for both reference implementations. Some of them are basic:

  • Arithmetic operations: ADD, MUL, SUB, DIV, SDIV, MOD, SMOD, EXP, NOT, LT, GT, SLT, SGT, EQ, ADDMOD, MULMOD
  • Bitwise operations: AND, OR, XOR, BYTE
  • Cryptographic operations: SHA3
  • Context operations: ADDRESS, ORIGIN, BALANCE, CALLER, CODESIZE, EXTCODESIZE, GASPRICE, BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT
  • Data operations: PUSH1-PUSH32, POP, DUP1-DUP16, SWAP1-SWAP16, MLOAD, MSTORE, MSTORE8
  • Flow operators: JUMP, JUMPI, JUMPDEST, RETURN, CALL, CALLCODE, CREATE, STOP
  • Logging: LOG0 - LOG4
  • Other: SUICIDE

Then we have some more complex ones:

  • CALLVALUE, CALLDATALOAD, CODECOPY, EXTCODECOPY

Here are some of the attack avenues that I checked out, but which didn’t lead me anywhere:

  • Multiple suicide. I found that it was possible to commit suicide multiple times via one “thread”. A contract could call suicide on itself several times; however, the kickback is only paid out once. Also, suicide directly in the ‘constructor’ does not cause any problems.
  • Contract ownership. What if it is possible to deploy a contract, but have the private key to that contract? Well, it’s not possible; the contract address is generated with sufficient entropy to make that infeasible.
  • Differences in how gas usage is calculated. A minor difference here would split the blockchain immediately.
  • Error handling - if there were any cases where ‘bad input’ was handled differently between the two clients. For example, the input PUSH32 without being followed by 32 bytes of data.

Although I did find a few cases where the geth client panicked due to high memory usage, those crashes were not usable since they required extremely large amounts of gas to execute.

More interesting was the use of JUMPDEST. So, JUMPDEST marks a position within the code that is a valid destination for jumps. This is a building-block for e.g. creating loops in a higher-level language.

For each instruction, the two clients check the command, and calculate the amount of gas required to execute the instruction. If enough gas is present, the command is executed, and the VM moves on to the next instruction.
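To make that per-instruction loop a bit more concrete, here is a minimal Python sketch of a gas-metered interpreter. It is not taken from either client: the `run` function, the `OutOfGas` exception and the `GAS_COST` table are names I made up for illustration, and the cost values are placeholders rather than the real fee schedule.

```python
class OutOfGas(Exception):
    """Raised when the remaining gas cannot pay for the next instruction."""


# Hypothetical per-opcode costs, for illustration only.
GAS_COST = {"PUSH1": 3, "ADD": 3, "STOP": 0}


def run(program, gas):
    """Execute (opcode, operand) pairs, charging gas before each step."""
    stack = []
    for op, arg in program:
        cost = GAS_COST[op]
        if gas < cost:
            # Not enough gas: abort instead of executing the instruction.
            raise OutOfGas(f"need {cost} gas for {op}, have {gas}")
        gas -= cost
        if op == "PUSH1":
            stack.append(arg)
        elif op == "ADD":
            a = stack.pop()
            b = stack.pop()
            stack.append((a + b) % 2**256)  # EVM words are 256 bits wide
        elif op == "STOP":
            break
    return stack, gas


program = [("PUSH1", 2), ("PUSH1", 3), ("ADD", None), ("STOP", None)]
stack, remaining = run(program, gas=10)
```

The point is only the ordering: the cost is checked and deducted first, and the instruction runs afterwards. Any disagreement between two clients in that cost calculation would show up as a different remaining gas value.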

Both the geth and C++ clients, whenever given a piece of code to execute, would first go through the code and locate all JUMPDESTs in order to create a map of valid destinations. After that, the code would be executed, one instruction at a time.
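That pre-pass can be sketched roughly as follows. This is my own simplified Python version, not code from either client; `find_jumpdests` is a hypothetical name, and I am assuming that the bytes following a PUSH are treated as data and skipped, so a 0x5b byte inside push data does not count as a destination.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def find_jumpdests(code: bytes) -> set:
    """Return the set of code offsets that hold a real JUMPDEST opcode."""
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # PUSHn is followed by n bytes of immediate data; skip them.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


# JUMPDEST, PUSH1 0x5b (data, not a destination), JUMPDEST
example = bytes([0x5B, 0x60, 0x5B, 0x5B])
valid = find_jumpdests(example)
```

Note that the cost of this scan is proportional to the size of the code, and it happens before a single instruction has been executed.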

The interesting part was that whenever a CALL/CALLCODE occurred, calling a method in the same code block, another VM instance would take this same piece of code, and do the same mapping once again.

So, in essence, this pseudo-code illustrates the problem:

contract Foo{ 
	function bar(int v) returns(bytes32) 
	{
		if(v == 0) {return "Done";}
		return bar(v-1);

		// artificially inserted JUMPDESTs x 10 000 go here
		// Note: placing them after the return is perfectly fine

		JUMPDEST
		... [10 000 times]
		JUMPDEST
	}
}

The VM implementations have a maximum call depth of 1024, so the code above would recursively call bar 1024 times. In the meantime, it would build 1024 JUMPDEST-maps.
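A toy model of that behaviour, in Python, looks something like this. Everything here is hypothetical (`analyse`, `entries_held_at_depth`), and the analysis is deliberately simplified: it ignores PUSH data and just collects every 0x5b byte. The important part is that each nested frame builds its own map and keeps it alive while the deeper frames run.

```python
JUMPDEST = 0x5B


def analyse(code: bytes) -> set:
    """Simplified JUMPDEST analysis (ignores PUSH data for brevity)."""
    return {pc for pc, op in enumerate(code) if op == JUMPDEST}


def entries_held_at_depth(code: bytes, depth: int) -> int:
    """Model `depth` nested calls, each frame holding its own map."""
    frames = []
    for _ in range(depth):
        # Every new VM instance re-analyses the same code from scratch.
        frames.append(analyse(code))
    # All maps are alive at once when the deepest call is reached.
    return sum(len(m) for m in frames)


# Kept small on purpose; the real attack used ~10 000 JUMPDESTs at depth 1024.
small_code = bytes([JUMPDEST]) * 100
held = entries_held_at_depth(small_code, depth=8)
```

Scaling the two numbers up to 10 000 and 1024 gives a little over ten million live map entries for a single transaction, so I would not run it with those values on a machine you care about.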

Since the CALL/CALLCODE operations create a new VM, there will be 1024 maps. Some basic calculations: 1024 * 50 bytes * 10 000 => 512 MB. (50 bytes is a handwavy estimation of the bytes per entry in the map. Could be off by an order of magnitude in either direction.) So, 512 MB of memory would be used for executing this particular transaction. If even more JUMPDESTs can be crammed in there, we’d go into gigabytes.
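The same back-of-the-envelope calculation as a few lines of Python, including the "order of magnitude in either direction" range. The function name `estimate_bytes` is just something I picked, and the 50 bytes per entry is still the same guess as above.

```python
CALL_DEPTH = 1024
JUMPDESTS = 10_000
BYTES_PER_ENTRY = 50  # handwavy guess, not a measured value


def estimate_bytes(depth: int, dests: int, per_entry: int) -> int:
    """Total map memory if every call frame holds its own JUMPDEST map."""
    return depth * dests * per_entry


best_guess = estimate_bytes(CALL_DEPTH, JUMPDESTS, BYTES_PER_ENTRY)
low = estimate_bytes(CALL_DEPTH, JUMPDESTS, BYTES_PER_ENTRY // 10)
high = estimate_bytes(CALL_DEPTH, JUMPDESTS, BYTES_PER_ENTRY * 10)

for label, value in (("low", low), ("guess", best_guess), ("high", high)):
    print(f"{label}: {value / 10**6:.1f} MB")
```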

I reported this to the Ethereum Bug Bounty. They verified that they were actually able to make the Go client hog 2 GB of memory and take 18 seconds to execute (which is a bad thing, particularly since the block time is 12 seconds):

With the current block gas limit we managed to get a contract in the BC with 15K JUMPDESTs and then calling it in a later block cost around 87625.

We added a test for this attack here https://github.com/ethereum/tests/blob/develop/StateTests/stSpecialTest.json#L2 (and another one later down).

While current gas costs are subject to research and we have to fine-tune them in terms of economics, this definitely count as a DoS as cpp was using 1.2 GB of memory and Go 2 GB for this test, and it took Go 18s to complete it.

We implemented a fix which caches JUMPDESTs in a map for recursive CALLs. This brings down the mem usage of this test to 0.7MB and fraction of a second in CPU time. https://github.com/ethereum/go-ethereum/pull/1150
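I have not reproduced the actual fix here, but the general idea of caching the analysis can be sketched in Python like this. This is my own illustration and not the code from the pull request: `jumpdests_cached`, `_cache` and `analyses_run` are hypothetical names, and I use SHA-256 from the standard library as a stand-in key for "the same piece of code".

```python
import hashlib

JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F

_cache = {}       # code hash -> frozenset of valid jump destinations
analyses_run = 0  # counts how often the expensive scan actually ran


def jumpdests_cached(code: bytes) -> frozenset:
    """Analyse `code` once; later calls with the same code reuse the result."""
    global analyses_run
    key = hashlib.sha256(code).digest()  # stand-in for a code hash
    cached = _cache.get(key)
    if cached is None:
        analyses_run += 1
        dests, pc = set(), 0
        while pc < len(code):
            op = code[pc]
            if op == JUMPDEST:
                dests.add(pc)
            elif PUSH1 <= op <= PUSH32:
                pc += op - PUSH1 + 1  # skip the PUSH immediate data
            pc += 1
        cached = _cache[key] = frozenset(dests)
    return cached


code = bytes([JUMPDEST]) * 100
# 1024 nested "calls" into the same code now share a single map.
maps = [jumpdests_cached(code) for _ in range(1024)]
```

With this shape, the recursive calls share one immutable map instead of each building their own, which is the kind of change that would explain memory dropping from gigabytes to under a megabyte.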

This earned me 2.5 BTC in reward!

Upcoming

These days (in fact, apparently this Thursday), the next-generation Ethereum network, “Frontier”, is launched: the first ‘production’ network with real ether and mining rewards. It’ll be interesting to see how everything progresses, but I’m guessing it’ll be picked up even more by the media.

Later on, a few more things are planned, such as the Mist browser, which basically integrates a web browser with an Ethereum VM monitor, so web pages can interface with Ethereum. I believe this will be the big killer app for Ethereum, more or less its public face, and also the component that will lead to the most vulnerabilities and attacks… and bounties, of course.

As Jutta Steiner wrote:

We had set aside an 11-digit satoshi amount to reward people who found bugs in our code. We’ve seen very high quality submissions to our bug bounty program and hunters received corresponding rewards. The bug bounty program is is still running and we need further submissions to use up the allocated budget…

It’s a highly complex project, and very interesting. I urge all bounty hunters out there to have a look and see if you can find some more bugs in there.

Further reading

2015-07-28
