feat: WAL-based RocksDB replication with HTTP streaming and failover #366
JackGuslerGit wants to merge 9 commits into matrix-construct:main
|
This implements async replication, so some data loss is to be expected after an unexpected failover (node failure, disk failure, process crash, ...), right? |
|
@pschichtel yes, that is correct. Under normal write load, RPO is determined just by network RTT. |
|
Hey @x86pup. It seems the CI failures on this PR are all runner-side cache issues. I am seeing two issues: 1: 2: Can you clean these up and re-run? |
These docker flakes sometimes occur when CI is really busy, apologies! We'll be happy to rerun as necessary. |
|
I haven't had a chance to thoroughly review this yet since I'm currently away, but a few things stand out as suspicious. Foremost it's not clear why WAL streaming is necessary. RocksDB already has internal mechanisms to synchronize primary and secondary; all that's missing is the promotion signalling. What is the basis for concerning ourselves with binary framing of rocksdb inner-workings at the user level? Is the rocksdb synchronization API being invoked here? Perhaps I missed it... |
|
@jevolk Yes, you are correct. In our case, we have a cluster of physical servers where each server has its own local disk. We can't use NFS or shared storage in our infrastructure, so core2 has no direct filesystem access to core1's RocksDB directory. That's the gap we're trying to fill by replicating the WAL and SST files over the network, so core2 can stay in sync with core1 without shared storage. Once core2 has a local copy of the data, is there a mechanism in RocksDB you'd recommend for this case, or is shared storage assumed in your deployment model? |
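To make the idea of keeping core2 in sync without shared storage concrete, here is a minimal sketch of how a follower might apply sequence-numbered write batches received over the network. It is an in-memory model using only the Rust standard library, not the code in this PR and not the RocksDB API: `WalBatch`, `Follower`, `ApplyError`, and `lag` are hypothetical names, and the assumption that each batch is re-sent whole on retry is mine.

```rust
use std::collections::BTreeMap;

/// One replicated write batch (hypothetical shape, not the PR's wire format).
/// The batch covers sequence numbers `first_seq .. first_seq + puts.len()`.
#[derive(Debug, Clone)]
pub struct WalBatch {
    pub first_seq: u64,
    pub puts: Vec<(String, String)>,
}

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    /// The follower missed at least one batch and must re-request from `expected`.
    Gap { expected: u64, got: u64 },
}

/// In-memory stand-in for the follower's local database (core2 in the discussion).
#[derive(Debug, Default)]
pub struct Follower {
    pub data: BTreeMap<String, String>,
    /// Sequence number of the last applied write; 0 means nothing applied yet.
    pub last_seq: u64,
}

impl Follower {
    /// Applies a batch if it is the next one in order.
    /// Returns Ok(true) if applied, Ok(false) if it was a duplicate and skipped.
    pub fn apply(&mut self, batch: &WalBatch) -> Result<bool, ApplyError> {
        let expected = self.last_seq + 1;
        if batch.first_seq < expected {
            // Assumption: batches are atomic and re-sent whole, so an older
            // starting sequence number means this batch was already applied.
            return Ok(false);
        }
        if batch.first_seq > expected {
            return Err(ApplyError::Gap { expected, got: batch.first_seq });
        }
        for (key, value) in &batch.puts {
            self.data.insert(key.clone(), value.clone());
        }
        self.last_seq += batch.puts.len() as u64;
        Ok(true)
    }
}

/// Number of writes the primary has accepted that the follower has not applied.
/// With async replication, these are the writes at risk on an unplanned failover.
pub fn lag(primary_latest_seq: u64, follower: &Follower) -> u64 {
    primary_latest_seq.saturating_sub(follower.last_seq)
}
```

Tracking a single `last_seq` keeps retries idempotent and turns a dropped batch into an explicit `Gap` error rather than silent divergence; the value returned by `lag` is one way to express the replication window discussed earlier in the thread.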
Alright so this is not limited to shared filesystem mounts, that's rather exciting actually. Keep up the good work 👍 |
|
@jevolk Thanks! I see it has passed all checks, what's the next step here? |
|
It needs to be thoroughly reviewed here, especially since the usage of AI is apparent. Jason is on vacation and will get to it soon. Thank you for ensuring CI passes to help this along. |
|
Okay sounds good, thanks for letting me know! |
|
Thank you for your patience 🙏 I'm right around the corner now... |
This relates to #35.
Summary:
Test plan:
Relevant config options added: