Guides
Hands on! Some examples
In this guide we will show some examples and go through some useful tips for data analyses on TF data. We will assume you have basic understanding of SQL and of the Ethereum ecosystem.
We will focus our examples on the *_diffs
datasets, as they differ from standard indexed data found elsewhere. The rest of the tables contain standard data and we hope they will look familiar to you.
We will have examples covering
Token Balances
Let’s start with a simple question, given a token and a wallet address, we want to query for the history of the token balances for the wallet.
Normally, you would do this by aggregating events data emitted by the token contract. While you can still do that with Token Flow’s data, using storage diffs make it trivial.
Let’s go over the query. We’re getting diffs for the Optimism USDT contract (chain_id=10
). If we look at the contract, we can see that there is a _balances
storage variable. This variable stores the balances for every address.
Specifically, it’s a mapping address->int
. We subset for storage diffs modifying this variable, then we specify that we want only those modifying the value associated to the wallet address we’re interested in. We do this by subsetting on the hashmap_keys
variable, which is an array that contains all the mapping keys associated to the specified storage variable.
Since we know the mapping only takes one argument, we access the 0-th element of the array using the syntax hashmap_keys::$
0`` (notice the backticks around the 0). JSON elements are stored as quoted strings, and the $
symbol before the backtick unquotes the string.
Lastly we make sure to exclude reverted
actions as they’re not useful for this analysis.
We now have the relevant storage diffs. Each storage diff is specified by a storage location, a prev_value
, the value before the diff, and a curr_value
, the value after the diff.
Let’s say we want to get end-of-day balances, therefore we need to get the latest value for each day for the curr_value
variable. Since there can be multiple diffs done at the same timestamp, i.e. in the same block, we need to use another way to order them.
A storage diff is uniquely identified by its block_number
and its order_index
, the order in which it was applied in the block.
With the help of the last_value
window function we are able to retrieve this value.
When using window functions on Singlestore, you have to pay attention to specify the window fame (in this case ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
), otherwise results might not be what you expect due to how the data is sharded.
Every value is stored agnostically as a string, so in this case we make sure to convert it to an integer type. Since balances can take very high values, we use the biggest integer type available in Singlestore, which is DECIMAL(65,0)
.
After adjusting for token decimals, we will have the results readily available!
Hourly Uniswap v3 prices
Now that storage diffs are clearer to you, we can move to a slightly harder example.
Let’s say we are interested in hourly prices of WBTC on Uniswap v3 in USDC units.
The idea is we can look at the pool contract for the pair WBTC-USDC and compute its price.
Normally, without diffs, you would look at the events emitted after every trade, and compute the prices from the exchanged volumes.
But the Uniswap v3 Pool contract already contains a storage slot for the price. More precisely, the slot0
storage variable is a struct - a named array - and the element slot0.qrtPriceX96
contains the price in scaled form. You can refer to this article by Uniswap for the derivation of the actual price.
Let’s see how we can obtain this value with a simple query on the storage_diffs
table:
In the first CTE, we extract the events emitted by the Uniswap Factory contract when a pool is created, with the address of the pool and of the involved tokens. Pay attention to how we access the named fields of the parameters
array: the syntax is column:$name
In the second CTE, we look for a pool for the pair WBTC-USDC, we take the address of the pool, and we keep track if the tokens are defined as token0
or token1
in the pool. In this case, the procedure will get the inverse of the price.
In the third CTE, we query the storage diffs table for the pool contract we obtained and we compute the median value of the prices for every hour.
Pay attention to the syntax to obtain elements of a struct
: after subsetting for the slot0
we use the structs
variable (an array of quoted strings), to access the needed element.
In the final statement, we simply perform the derivation of the price, and adjust for decimals.
Once again, storage diffs helped obtain the result in a very intuitive way, and without the need to perform pre-aggregations on event data.
Total fees earned for a Uniswap v3 Position
We can also look at something that is simply not possible using events data.
When we put tokens into a Uniswap position and act as liquidity providers, we receive fees in return, if we own the liquidity pool NFT. These fees can either be collected by the NFT owner or remain uncollected in the contract.
While collected fees are retrievable from emitted events, there is no easy way to obtain the unclaimed fees without calling a function of the contract.
But there is a storage variable that keeps track of exactly that!
The positions
storage variable is a mapping bytes->struct
, the struct in turn contains another struct: Info
, which contains the integers tokensOwed0
and tokensOwed1
representing the amounts owed each of the tokens in the pool. In this case we have a nested struct, but you will see how accessing nested items looks easy and familiar.
We selected a random market maker with an open position at the time of writing, we access the relevant item by subsetting on the hashmap_keys
variable using the position key, which is just a hash of the owner address, the lower and the upper tick, as in Solidity keccak256(abi.encodePacked(owner, tickLower, tickUpper))
.This can be easily computed beforehand and we leave it outside of the scope of this article. Then we access the nested item in the struct as "Info.tokensOwed0"
and "Info.tokensOwed1"
using the structs
variable.
STARK Token Holding Rate
Lastly, we can look at an airdrop analysis. STARK Token has just been released on Starknet and we can look at how many users in percentage are holding their tokens fully, or have partially or totally transferred.
In this case, we only make use of event data and of a view: starknet_provision_claims
that we abstracted to make it easy to analyze claimed tokens for the STARK token provisions.
The view contains all the information about claimed entries, but also unclaimed ones.
Pay special attention on how we access nested elements (eg. parameters::
value::$
low`` ). In this case, we access the element value
of the parameters
named array, ie. the output of the Transfer
event. value
is a named array itself, and we’re interested in the low
element, which is the number of tokens transferred. Since value
,low
, and from
are reserved keywords in the database, we have to surround them by backticks “”. We also make use of the
$` operator before the last nested element to make sure it is unquoted.
Conclusion
We have shown you some examples to help you unleash the power of blockchain data through the help of storage diffs, an innovative way to look at data. Now it’s your turn! Do you have any use case you’ve struggled to solve with existing tools? Do you want to know how storage diffs can help you tackle it? Hop on our Discord and let us know.
Last updated