Getting started with storage diffs
Why do storage diffs matter?
Storage diffs are the unsung heroes of EVM-compatible blockchains. They ensure all nodes agree on the current blockchain state, enabling smart contracts to function correctly, saving resources, and enhancing security by detecting unauthorised changes.
Obtaining storage diffs is a complex task that demands considerable computational resources, making it less accessible to the average consumer.
At the same time, they are incredibly difficult to decode, which makes the information they hold unavailable to most network users. If you want to know the value at a specific storage location, you can use web3's getStorageAt function. Tracking storage changes within a block, however, is a challenge. You typically only see the values before and after the entire block, which is less than ideal. To make matters more difficult, many storage locations are calculated as hashes of slots and keys, making it difficult to iterate over hash-maps without knowing the keys in advance. Here is some more theory on the subject.
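To make the "hashes of slots and keys" point concrete, here is a minimal sketch of how the storage location of a single mapping entry is derived under Solidity's standard layout rule: keccak256 of the 32-byte padded key followed by the 32-byte padded slot number. The function names are hypothetical, and the hash itself is passed in as a parameter because Python's standard library does not ship Keccak-256 (note that `hashlib.sha3_256` is not the same function).

```python
from typing import Callable


def pad32(hex_value: str) -> bytes:
    """Left-pad a hex string (with or without a 0x prefix) to one 32-byte word."""
    digits = hex_value[2:] if hex_value[:2].lower() == "0x" else hex_value
    if len(digits) > 64:
        raise ValueError("value does not fit in one 32-byte word")
    return bytes.fromhex(digits.rjust(64, "0"))


def mapping_preimage(key: str, slot: int) -> bytes:
    """Bytes that get hashed for mapping[key] declared at `slot`: pad(key) ++ pad(slot)."""
    return pad32(key) + slot.to_bytes(32, "big")


def mapping_location(key: str, slot: int,
                     keccak256: Callable[[bytes], bytes]) -> bytes:
    """Storage location of mapping[key]; `keccak256` is supplied by the caller."""
    return keccak256(mapping_preimage(key, slot))
```

Assuming web3.py is installed, `Web3.keccak` could be passed as the `keccak256` argument. Because the key is part of the hashed input, the location cannot be enumerated without already knowing the key.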
Token Flow's Ethereum Data Warehouse features a readily available storage layout for verified contracts, structured for easy querying. Our datasets include decoded data enriched with semantics, which means they should be easy for a regular analyst to use, not just for smart contract experts.
Here are some examples of what data you can retrieve from the blockchain using our datasets.
Finding a smart contract to decode
We're starting by finding a smart contract that is simple enough to walk you through how we process and store storage diff data.
Decoding 0xc02a...56cc2 or WETH
The Wrapped Ether (WETH) smart contract, identified by the address 0xc02a...56cc2, stands out as the most widely used with almost 230 million interactions. Its simplicity makes it an excellent choice to kickstart our exploration of storage diffs decoding.
To start, we're going to do a simple analysis of the smart contract source code:
The contract does not implement any abstraction and consists of 5 simple variables. There are no structures in this contract.
Starting from the top (reflecting the variable organisation in storage), we observe the following in the provided code excerpt:
name - type: string, value: Wrapped Ether;
symbol - type: string, value: WETH;
decimals - type: uint8, value: 18;
balanceOf - type: mapping. It is a hashmap from address to uint;
allowance - type: double mapping. It is a double hashmap from a pair of addresses to uint.
We're going to use JSON to structure the source code for future decoding.
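As an illustration only, the five variables could be captured in a JSON layout along the following lines. The field names (`slot`, `name`, `type`) are hypothetical and are not the schema used in the Token Flow datasets; the sketch just shows the idea of recording each variable's slot and type for later decoding.

```python
import json

# Hypothetical layout description: one entry per state variable, in storage order.
WETH_LAYOUT = [
    {"slot": 0, "name": "name", "type": "string"},
    {"slot": 1, "name": "symbol", "type": "string"},
    {"slot": 2, "name": "decimals", "type": "uint8"},
    {"slot": 3, "name": "balanceOf", "type": "mapping(address => uint)"},
    {"slot": 4, "name": "allowance",
     "type": "mapping(address => mapping(address => uint))"},
]

# Serialise the layout so it can be stored next to the contract address.
layout_json = json.dumps(WETH_LAYOUT, indent=2)

# Lookup table from slot number to variable name, used when labelling diffs.
SLOT_NAMES = {entry["slot"]: entry["name"] for entry in WETH_LAYOUT}
```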
Note that storage slot numbers start at 0, not at 1 (so the five variables above occupy slots 0 to 4, in order). Solidity type definitions tell us how many bytes a variable occupies in a slot.
Looking at the smart contract source code, we notice that slots 0, 1 and 2 (name, symbol and decimals) are constant values. Let's look at slots 3 ("balanceOf") and 4 ("allowance").
Slot 3 - balanceOf
Location
Slot 3 is a simple map from address to uint. In our raw datasets the mapping type has a schema: slot[key0].field0.
We extracted the semantics of the variable "balanceOf" from the contract.
Since the key is already given as an address, it remains unchanged.
The ".0" denotes a potential structure field, but in this particular contract, there are no structures, so this value can be disregarded.
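The step from a raw location to a readable name can be sketched as follows. This assumes a location is written literally in the shape of the schema, for example `3[0xaa].0`; the exact textual format in the raw datasets is not shown here, so the parser, the sample strings, and the function names are all hypothetical.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class Location(NamedTuple):
    slot: int
    path: List[Tuple[str, int]]  # (mapping key, struct field index) pairs


# One "[key].field" step, and a full location made of a slot plus any number of steps.
_STEP = re.compile(r"\[([^\]]+)\]\.(\d+)")
_FULL = re.compile(r"(\d+)((?:\[[^\]]+\]\.\d+)*)")


def parse_location(text: str) -> Location:
    """Split 'slot[key0].field0[key1].field1' into its slot and (key, field) steps."""
    match = _FULL.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    steps = [(key, int(field)) for key, field in _STEP.findall(match.group(2))]
    return Location(int(match.group(1)), steps)


def label(location: Location, slot_names: Dict[int, str]) -> str:
    """Replace the slot number with the variable name and drop the unused field indexes."""
    keys = "".join(f"[{key}]" for key, _ in location.path)
    return slot_names[location.slot] + keys
```

With `slot_names = {3: "balanceOf", 4: "allowance"}`, a single-key location is labelled `balanceOf[...]` and a two-key location `allowance[...][...]`.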
Previous value & current value
Let's left-pad each location with zeros to the full 32 bytes (64 hex characters). Values are always trimmed from the end.
Variables in slot 3 are 32 bytes long, so the entire slot is used. For a smaller size, we would trim from the end (e.g. size = 20 => 40 characters).
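The padding and trimming described above can be sketched like this. The sketch assumes that "trimmed from the end" means keeping the trailing characters of the padded word, which matches how Solidity right-aligns a value in its slot; it ignores variables that sit at a non-zero offset inside a shared slot. The function name is hypothetical.

```python
def fit_to_size(raw_hex: str, size: int = 32) -> str:
    """Left-pad a hex value to 64 characters, then keep the last 2 * size characters."""
    if not 1 <= size <= 32:
        raise ValueError("size must be between 1 and 32 bytes")
    digits = raw_hex[2:] if raw_hex[:2].lower() == "0x" else raw_hex
    if len(digits) > 64:
        raise ValueError("value is longer than one 32-byte slot")
    padded = digits.rjust(64, "0")   # full slot: 32 bytes = 64 hex characters
    return padded[-2 * size:]        # e.g. size = 20 keeps the last 40 characters
```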
According to Solidity types, uint is an alias for uint256. How can we decode it? We used a helper function in Python for this.
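The original helper is not reproduced here. A minimal sketch of what such a helper might look like, assuming the value arrives as a big-endian hex string with an optional 0x prefix, is:

```python
def decode_uint(value_hex: str) -> int:
    """Interpret a big-endian hex string as an unsigned integer (uint256 or smaller)."""
    digits = value_hex[2:] if value_hex[:2].lower() == "0x" else value_hex
    if len(digits) > 64:
        raise ValueError("value is longer than 32 bytes")
    # An empty string (e.g. a slot that was never written) decodes to zero.
    return int(digits, 16) if digits else 0
```

For example, `decode_uint("0x0de0b6b3a7640000")` yields 10**18, which is one whole token for a contract with 18 decimals.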
Interpretation
balanceOf[0x980c...e08f6] shows how many tokens (the value) a specific address owns. Using slot 1 (symbol) and slot 2 (decimals), we know that this token's symbol is WETH and that it has 18 decimal places.
The number of tokens belonging to 0x980c...e08f6 changed from 5.099 WETH to 3.213 WETH.
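Turning the decoded integer into a human-readable amount only requires the decimals value from slot 2. The sketch below uses integer arithmetic so that no precision is lost; the raw integers in the usage note are hypothetical round numbers chosen to match the rounded figures in the text, not the actual on-chain values.

```python
def format_units(raw: int, decimals: int = 18) -> str:
    """Render a raw token amount as a decimal string with `decimals` fractional digits."""
    if raw < 0 or decimals < 0:
        raise ValueError("raw and decimals must be non-negative")
    if decimals == 0:
        return str(raw)
    whole, fraction = divmod(raw, 10 ** decimals)
    return f"{whole}.{fraction:0{decimals}d}"


# Hypothetical raw values, for illustration only.
previous_raw = 5_099_000_000_000_000_000
current_raw = 3_213_000_000_000_000_000
```

Here `format_units(previous_raw)` gives `5.099000000000000000` and `format_units(current_raw)` gives `3.213000000000000000`.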
Slot 4 - allowance
Location
Slot 4 is a double mapping: address map to address map to uint. In our raw datasets the double mapping type has a schema: slot[key0].field0[key1].field1.
We extracted the semantics of the variable "allowance" from the contract.
The ".0" denotes a potential structure field, but in this particular contract, there are no structures, so this value can be disregarded.
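For a nested mapping, Solidity's layout rule applies the hashing once per key: the location computed for the outer key becomes the "slot" for the inner key. A minimal sketch, with hypothetical function names and with the Keccak-256 function passed in by the caller (for example `Web3.keccak`, assuming web3.py is available), could look like this:

```python
from typing import Callable, Sequence


def pad32(hex_value: str) -> bytes:
    """Left-pad a hex string (with or without a 0x prefix) to one 32-byte word."""
    digits = hex_value[2:] if hex_value[:2].lower() == "0x" else hex_value
    if len(digits) > 64:
        raise ValueError("value does not fit in one 32-byte word")
    return bytes.fromhex(digits.rjust(64, "0"))


def nested_mapping_location(keys: Sequence[str], slot: int,
                            keccak256: Callable[[bytes], bytes]) -> bytes:
    """Location of mapping[keys[0]][keys[1]]... declared at `slot`."""
    location = slot.to_bytes(32, "big")
    for key in keys:  # outermost key first
        location = keccak256(pad32(key) + location)
    return location
```

For `allowance`, `keys` would be `[key0, key1]` and `slot` would be 4.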
Previous value & current value
Using the same process as for slot 3, we will get these values:
Interpretation
allowance[0x9f7...fff55][0xe5c...be4e1] shows the maximum number of tokens that address1 (key1) can transfer from address0's (key0) account.
Using slot 1 (symbol) and slot 2 (decimals), we know that this token's symbol is WETH and that it has 18 decimal places.
The maximum number of tokens that 0xe5c7...be4e1 can transfer from 0x9f7f...fff55 is 115...457.584 WETH, which corresponds to the largest value a uint256 can hold.
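The figure above can be checked with a few lines of integer arithmetic: the largest uint256 is 2**256 - 1, and shifting it by 18 decimal places gives a whole part that starts with 115 and ends with 457, followed by a fractional part that starts with 584. The names below are hypothetical.

```python
MAX_UINT256 = 2 ** 256 - 1  # every bit of the 32-byte slot set to 1


def is_unlimited_allowance(raw: int) -> bool:
    """True when an allowance equals the maximum uint256, i.e. is effectively unlimited."""
    return raw == MAX_UINT256


# Split the raw maximum into whole tokens and the 18-digit fractional part.
whole, fraction = divmod(MAX_UINT256, 10 ** 18)
max_as_weth = f"{whole}.{fraction:018d}"
```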