-
Notifications
You must be signed in to change notification settings - Fork 288
Description
Harmony's Transaction Ordering
In general, each blockchain's implementation uses one of many mechanisms for determining the order of transactions in a block. With proof-of-stake blockchains like Harmony, validators generally follow the blockchain's implementation of transaction ordering (i.e. they do not have the incentive to alter their node to use a different ordering implementation, like Ethereum miners do). Harmony's implementation of transaction ordering is apparently taken from some old geth version:
- The transactions are first pushed to the mempool, grouped by sender. This forms a hashmap keyed by the sender address. The reason why we group by sender is to allow the same sender to send multiple transactions with successive nonces, so that we will guarantee to process the lower nonces first.
- When a validator produces a block, it will first organize the hashmap into a heap (where a heap element is a single sender's txns). The order of elements in the heap is based on the gas price:
harmony/core/types/transaction.go
Line 536 in 9ab1038
func (s TxByPrice) Less(i, j int) bool { return s[i].GasPrice().Cmp(s[j].GasPrice()) > 0 } - Because the gas price is the only ordering criteria, for elements in the heap that have equal gas price, the effective ordering is determined by (but not the same as, due to heap operations) the original ordering of the elements in the hashmap, which is random.
What's Wrong with Random Ordering?
Random ordering, by itself, is not a problem; it may even be desirable because it makes ordering less predictable. However, there are scenarios in which a sophisticated actor can be economically incentivized to make this predictable anyway. One particular example is arbitrage bots:
- Suppose an ordinary user submits a transaction T0 to swap 1000 A for 10000 B that causes sufficient slippage in the liquidity pool to create an arbitrage opportunity (e.g. one may exchange 10 B -> some A via this pool and then use some other pool to swap those A -> 11 B).
- An arbitrage bot sees this, and in response, sends their own transaction T1 (to their custom contract) to take advantage of this arbitrage opportunity. Note that T1 must be executed after T0.
- If there are multiple arbitrageurs, the first arbitrage transaction that executes after T0 will "win" the opportunity: Let's suppose there are two other actors sending transactions T2 and T3. If the transactions are ordered (relative to each other) like T1, T0, T2, T3, then T2 wins the arbitrage opportunity.
The Spamming Problem
Arbitrage is a competitive environment. It is common practice for arbitrage bots to monitor the mempool for pending transactions before they are finalized, and, upon seeing an arbitrage opportunity, insert their own transactions within the same block.
If there were only a single arbitrage bot, it would be trivial to ensure that T1 is after T0, by sending T1 at a lower gas price (assuming T0 is not already at the minimum gas price); however, if there is any competition, this strategy will lose, because sending T2 at gas price equal to T0 would win with 50% probability (since the outcome may be [T2, T0, T1] or [T0, T2, T1]. To increase the probability further, one may send multiple identical transactions (from different senders) so that they occupy more slots in the mempool hashmap; e.g. by sending 100 "copies" (identical except sender) of T2, winning is almost certain.
The first arbitrage bot, as they keep losing, would respond by sending also multiple copies of T1 at the same gas price. If they also send 100 copies, then the probability that T1 is the first arbitrage transaction after T0 is 50%. But if they send 900 copies, then the probability rises to 90%. This is an especially viable strategy in Harmony because of the low gas fees.
This is the leading cause of transaction spamming on Harmony as I have observed.
Arbitrage Bot Spamming in Action
I have put together the following visualization that shows a few blocks full of arbitrage transactions. For each block I have displayed the number of transactions per target contract (displaying only the suffix of the contract and only if count >= 10), and for some top contracts I have assigned arbitrary colors (I'm out of colors to assign already!). Each square indicates a transaction to the contract indicated by that color; gray meaning it's not specifically labelled.
For this particular spamming spree, the winner transaction was https://explorer.harmony.one/tx/0xc876308b136f9dc0e2b325e24dd7201279dc2b274e0c2659ba12428d670805bc (with a profit of ~700 WONE), which was right after the "T0" transaction: https://explorer.harmony.one/tx/0xb7a0d97438f9c8e5cf09587b17399659c77853d08c4aa28b6679205e3e905b32 (buying $100000 worth of JEWELs). The winner is the 0x881cd26c9832064c833352b56dcf4de5e17f45d0 contract, which is colored blue in the above visualization. This contract was created less than 24 hours ago at the time of this writing, and yet has already accumulated more transactions than the explorer can display in 1000 pages, and there are a handful of similar contracts like this.
Why Does This Matter So Much?
Harmony has been suffering from spam issues for quite a while now. The network has been down or slowed frequently. The explorer also suffers from an excessive transaction volume. We need to solve these issues to help Harmony grow in the long term.
OK, So What Can We Do About It?
We should stop using hashmap ordering completely. One idea is to use timestamp as a secondary ordering key (gas price first, then timestamp; the timestamp needs to be of sufficient precision to avoid ties which would fall back again to the hashmap ordering). This was what BSC did: bnb-chain/bsc#269 . See this comment in that thread for some context: bnb-chain/bsc#269 (comment) (quote: "When transaction spamming was viable, BSC was on the verge of collapse will full blocks and forks.") This should be a simple enough change; since transaction ordering is not validated, this may be able to be pushed without a breaking change.
Will This Help?
Changing to gas-price-then-timestamp ordering will disincentivize arbitrage bots from spamming multiple transactions as it will no longer lead to any advantage. The arbitrage bot competition would likely transition to a latency battle, which should put a substantially lower burden on the chain itself. It will of course take time before various bot operators react. Since arbitrageurs are sophisticated actors, it is difficult to predict what new strategies they may come up with, so there's a degree of uncertainty for sure. Nonetheless, it is hard to imagine that spamming at this magnitude would still be useful for anything.
