| 1 | +--- |
| 2 | +title: Frontier Merkle Tree |
| 3 | +--- |
| 4 | + |
| 5 | +The Frontier Merkle Tree is an append-only Merkle tree that is optimized for minimal on-chain storage. |
| 6 | +By storing only the right-most non-empty node at each level of the tree we can always extend the tree with a new leaf or compute the root without needing to store the entire tree. |
| 7 | +We call these values the frontier of the tree. |
| 8 | +Given the next index to insert at and the current frontier, we have everything needed to extend the tree or compute the root, while storing at most one node per level instead of every node of a full Merkle tree. |
| 9 | +Note that we're not actually keeping track of the data in the tree: we only store what's minimally required in order to be able to compute the root after inserting a new element. |
| 10 | + |
| 11 | +We will first go through a few diagrams and explanations to understand how this works. |
| 12 | +A pseudo implementation is provided afterwards. |
| 13 | + |
| 14 | + |
| 15 | +## Insertion |
| 16 | +Whenever we insert a leaf, we need to update the "root" of the largest subtree that the insertion completes. |
| 17 | +That is, we update the frontier at the level of the highest node whose right-most descendant is the leaf we have just inserted. |
| 18 | +This can sound a bit confusing, so we will go through a few examples. |
| 19 | + |
| 20 | +At first, say that we have the following tree, and that it is currently entirely empty. |
| 21 | + |
| 22 | + |
| 23 | + |
| 24 | +### The first leaf |
| 25 | + |
| 26 | +When we are inserting the first leaf (let's call it A), the largest subtree is that leaf value itself (level 0). |
| 27 | +In this case, we simply need to store the leaf value in `frontier[0]` and then we are done. |
| 28 | +For the sake of visualization, we will be drawing the elements in the `frontier` in blue. |
| 29 | + |
| 30 | + |
| 31 | + |
| 32 | +Notice that this will be the case whenever we are inserting a leaf at an even index. |
| 33 | + |
| 34 | +### The second leaf |
| 35 | + |
| 36 | +When we are inserting the second leaf (let's call it B), the largest subtree will no longer be at level 0. |
| 37 | +Instead it will be level 1, since the entire tree below it is now filled! |
| 38 | +Therefore, we will compute the root of this subtree, `H(frontier[0],B)` and store it in `frontier[1]`. |
| 39 | + |
| 40 | +Notice that we don't need to store the leaf B itself, since we won't be needing it for any future computations. |
| 41 | +This is what makes the frontier tree efficient: we get away with storing very little data. |
| 42 | + |
| 43 | + |
| 44 | + |
| 45 | +### The third leaf |
| 46 | +When inserting the third leaf, the largest subtree filled by the insertion is once again the leaf itself, at level 0. |
| 47 | +The update will look similar to the first, where we only update `frontier[0]` with the new leaf. |
| 48 | + |
| 49 | + |
| 50 | + |
| 51 | +### The fourth leaf |
| 52 | + |
| 53 | +When inserting the fourth leaf, things get a bit more interesting. |
| 54 | +Now the largest subtree getting filled by the insertion is at level 2. |
| 55 | + |
| 56 | +To compute the new subtree root, we have to compute `F = H(frontier[0], E)` and then `G = H(frontier[1], F)`. |
| 57 | +G is then stored in `frontier[2]`. |
| 58 | + |
| 59 | + |
| 60 | +As before, notice that we are only updating one value in the frontier. |
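The four insertions above can be replayed symbolically. This is a minimal sketch, not the actual implementation: `H` is a stand-in that only records the structure of the hash, the levels are copied from the walkthrough, and the name `D` for the third leaf is an assumption because the text does not name it.

```python
def H(left: str, right: str) -> str:
    # Symbolic stand-in for a hash function: it only records the structure.
    return f"H({left},{right})"

frontier: dict[int, str] = {}
history = []

# (leaf, level to update), taken from the walkthrough above.
# The leaf name "D" is hypothetical; A, B and E follow the text.
for leaf, level in [("A", 0), ("B", 1), ("D", 0), ("E", 2)]:
    node = leaf
    # Combine with the stored left siblings below the level being updated.
    for i in range(level):
        node = H(frontier[i], node)
    frontier[level] = node  # exactly one frontier entry changes per insertion
    history.append((leaf, level, node))
    print(f"insert {leaf}: frontier[{level}] = {node}")
```

After the fourth insertion, `frontier[2]` holds `H(H(A,B),H(D,E))`, while `frontier[0]` and `frontier[1]` still hold stale values that will be overwritten before they are read again.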
| 61 | + |
| 62 | + |
| 63 | +## Figuring out what to update |
| 64 | + |
| 65 | +To figure out which level to update in the frontier, we need the height of the largest subtree that the insertion fills. |
| 66 | +While this might sound complex, it is actually quite simple. |
| 67 | +Consider the following extension of the diagram. |
| 68 | +We have added the level to update, along with the index of the leaf in binary. |
| 69 | +Seeing any pattern? |
| 70 | + |
| 71 | + |
| 72 | + |
| 73 | +The level to update is simply the number of trailing ones in the binary representation of the index. |
| 74 | +In a binary tree, every `1` in the binary index represents a "right turn" on the way down the tree. |
| 75 | +Walking up the tree from the leaf, we can simply count the number of right turns until we hit a left-turn. |
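As a quick check of the trailing-ones rule, the sketch below lists the level to update for the first eight leaf indices. The helper name `trailing_ones` is chosen here for illustration only.

```python
def trailing_ones(index: int) -> int:
    """Count the trailing 1-bits of index, i.e. the frontier level to update."""
    count = 0
    while index & 1 == 1:
        count += 1
        index >>= 1
    return count

levels = {i: trailing_ones(i) for i in range(8)}
for i, level in levels.items():
    print(f"leaf index {i} = {i:03b} -> update frontier[{level}]")
```

Even indices end in `0`, so they always update level 0, which matches the observation made for the first and third leaves.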
| 76 | + |
| 77 | +## How to compute the root |
| 78 | + |
| 79 | +Computing the root based on the frontier is also quite simple. |
| 80 | +We can use the index of the last inserted leaf to figure out how high up the frontier we should start. |
| 81 | +We also know that anything to the right of the frontier has not yet been inserted, so all of these values are simply "zeros" values. |
| 82 | +A zeros value for a level is the root of a subtree of that height that contains only empty leaves. |
| 83 | + |
| 84 | +For example, if we take the tree from above (four leaves inserted into a tree of height 3) and compute the root for it, we would see that level 2 was updated last. |
| 85 | +This means that we can simply compute the root as `H(frontier[2], zeros[2])`. |
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | +For cases where we have built further, we simply "walk" up the tree: at each level we hash with the frontier value on the left if we are a right child, or with the zero value on the right if we are a left child. |
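The walk described above can be checked against a naive full-tree computation. This is a minimal sketch under stated assumptions: the tree height of 3, SHA-256 over the concatenated children, and an empty leaf of 32 zero bytes are all chosen for illustration, and the helper names are hypothetical.

```python
import hashlib

HEIGHT = 3  # assumed small height, for illustration only

def H(left: bytes, right: bytes) -> bytes:
    # Assumption: nodes are hashed as sha256(left || right).
    return hashlib.sha256(left + right).digest()

# zeros[i] is the root of an empty subtree of height i.
# Assumption: an empty leaf is 32 zero bytes.
zeros = [bytes(32)]
for _ in range(HEIGHT):
    zeros.append(H(zeros[-1], zeros[-1]))

def trailing_ones(index: int) -> int:
    count = 0
    while index & 1 == 1:
        count += 1
        index >>= 1
    return count

def frontier_root(frontier: dict[int, bytes], next_index: int) -> bytes:
    if next_index == 0:
        return zeros[HEIGHT]
    if next_index == 2**HEIGHT:
        return frontier[HEIGHT]
    index = next_index - 1
    level = trailing_ones(index)
    node = frontier[level]        # start at the level updated last
    bits = index >> level
    for i in range(level, HEIGHT):
        if bits & 1:              # right child: left sibling is in the frontier
            node = H(frontier[i], node)
        else:                     # left child: right sibling is still empty
            node = H(node, zeros[i])
        bits >>= 1
    return node

def naive_root(leaves: list[bytes]) -> bytes:
    # Reference: pad with empty leaves and hash the whole tree.
    layer = leaves + [zeros[0]] * (2**HEIGHT - len(leaves))
    while len(layer) > 1:
        layer = [H(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

leaves = [hashlib.sha256(bytes([i])).digest() for i in range(2**HEIGHT)]
frontier: dict[int, bytes] = {}
matches = [frontier_root(frontier, 0) == naive_root([])]
for index, leaf in enumerate(leaves):
    level = trailing_ones(index)
    node = leaf
    for i in range(level):
        node = H(frontier[i], node)
    frontier[level] = node
    matches.append(frontier_root(frontier, index + 1) == naive_root(leaves[: index + 1]))

print(all(matches))
```

The comparison runs after every insertion, so it covers the empty tree, the completely full tree, and every partially filled state in between.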
| 90 | + |
| 91 | +## Pseudo implementation |
| 92 | +```python |
| 93 | +class FrontierTree: |
| 94 | +    HEIGHT: immutable(uint256) |
| 95 | +    SIZE: immutable(uint256) |
| 96 | + |
| 97 | +    frontier: HashMap[uint256, bytes32] # level => node |
| 98 | +    zeros: HashMap[uint256, bytes32] # level => root of empty subtree of height level |
| 99 | + |
| 100 | +    next_index: uint256 = 0 |
| 101 | + |
| 102 | +    # Can entirely be removed with optimizations |
| 103 | +    def __init__(self, _height: uint256): |
| 104 | +        self.HEIGHT = _height |
| 105 | +        self.SIZE = 2**_height |
| 106 | +        # Populate zeros |
| 107 | + |
| 108 | +    def compute_level(self, _index: uint256) -> uint256: |
| 109 | +        ''' |
| 110 | +        We can get the height of the largest filled subtree by |
| 111 | +        counting the number of trailing ones in the index |
| 112 | +        ''' |
| 113 | +        count = 0 |
| 114 | +        x = _index |
| 115 | +        while (x & 1 == 1): |
| 116 | +            count += 1 |
| 117 | +            x >>= 1 |
| 118 | +        return count |
| 119 | + |
| 120 | +    def root(self) -> bytes32: |
| 121 | +        ''' |
| 122 | +        Compute the root of the tree |
| 123 | +        ''' |
| 124 | +        if self.next_index == 0: |
| 125 | +            return self.zeros[self.HEIGHT] |
| 126 | +        elif self.next_index == self.SIZE: |
| 127 | +            return self.frontier[self.HEIGHT] |
| 128 | +        else: |
| 129 | +            index = self.next_index - 1 |
| 130 | +            level = self.compute_level(index) |
| 131 | + |
| 132 | +            temp: bytes32 = self.frontier[level] |
| 133 | + |
| 134 | +            bits = index >> level |
| 135 | +            for i in range(level, self.HEIGHT): |
| 136 | +                is_right = bits & 1 == 1 |
| 137 | +                if is_right: |
| 138 | +                    temp = sha256(self.frontier[i], temp) |
| 139 | +                else: |
| 140 | +                    temp = sha256(temp, self.zeros[i]) |
| 141 | +                bits >>= 1 |
| 142 | +            return temp |
| 143 | + |
| 144 | +    def insert(self, _leaf: bytes32): |
| 145 | +        ''' |
| 146 | +        Insert a leaf into the tree |
| 147 | +        ''' |
| 148 | +        level = self.compute_level(self.next_index) |
| 149 | +        right = _leaf |
| 150 | +        for i in range(0, level): |
| 151 | +            right = sha256(self.frontier[i], right) |
| 152 | +        self.frontier[level] = right |
| 153 | +        self.next_index += 1 |
| 154 | +``` |
| 155 | + |
| 156 | +## Optimizations |
| 157 | +- The `zeros` can be pre-computed and stored in the `Inbox` directly, so that they can be shared across all of the trees. |
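Because the `zeros` depend only on the hash function and the empty leaf value, they can be computed once, independently of any tree contents. The following is a minimal sketch of such a pre-computation; the function name `precompute_zeros`, the 32-zero-byte empty leaf, and SHA-256 over concatenated children are assumptions made for illustration, and how the `Inbox` actually stores the values is not covered here.

```python
import hashlib

def precompute_zeros(height: int, empty_leaf: bytes = bytes(32)) -> list[bytes]:
    """Return zeros[0..height], where zeros[i] is the root of an empty subtree of height i."""
    zeros = [empty_leaf]
    for _ in range(height):
        # An empty subtree of height i+1 has two empty subtrees of height i as children.
        zeros.append(hashlib.sha256(zeros[-1] + zeros[-1]).digest())
    return zeros

# One shared table can serve every tree whose height is at most 4.
ZEROS = precompute_zeros(4)
print([z.hex()[:8] for z in ZEROS])
```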
| 158 | + |
| 159 | + |