sync issue for big databases (archive nodes) #215

@kogeler

Description

role: full(archive)
binary: docker pull parity/polkadot:v0.9.42
instance: GCP - t2d-standard-4
disk: GCP - SSD persistent disk
OS: Container-Optimized OS from Google
kernel: 5.10.162+
CLI flags:

```
--name=${POD_NAME} \
--base-path=/chain-data \
--keystore-path=/keystore \
--chain=${CHAIN} \
--database=paritydb \
--pruning=archive \
--prometheus-external \
--prometheus-port 9615 \
--unsafe-rpc-external \
--unsafe-ws-external \
--rpc-cors=all \
--in-peers 75 \
--out-peers 25 \
--public-addr=/ip4/${EXTERNAL_IP}/tcp/${RELAY_CHAIN_P2P_PORT} \
--listen-addr=/ip4/0.0.0.0/tcp/30333
```

I'm trying to sync backup nodes from scratch. I have 8 nodes, covering every combination of Kusama/Polkadot, archive/pruned, and RocksDB/ParityDB.

I use the same instance types, regions, and CLI flags for all of them.
All nodes have 100 peers (in 75 / out 25).

The 2 archive RocksDB nodes (Kusama, Polkadot) synced in a couple of days.
But the 2 archive ParityDB nodes (Kusama, Polkadot) have been syncing for 1.5 weeks. At some point (around 15M blocks), the sync rate dropped sharply; it is now close to 0 blocks/second. Restarts don't help.
It looks like a ParityDB issue.

The current state is:
Kusama - target=#18852634 (100 peers), best: #15387848 (0x117e…c8a0), finalized #15387648 (0x065b…4684), ⬇ 705.8kiB/s ⬆ 461.9kiB/s
Polkadot - target=#16463661 (100 peers), best: #15045441 (0x9fc3…0db0), finalized #15045402 (0xfef9…401b), ⬇ 134.9kiB/s ⬆ 125.2kiB/s

The disk subsystem is overloaded: ~15k read IOPS and ~100 MB/s of read throughput.
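For anyone trying to reproduce this, a quick way to confirm the read pressure on the node's disk, independent of cloud dashboards, is to sample `/proc/diskstats` twice and diff the counters. This is a minimal sketch (Linux only; not part of the node or ParityDB, field offsets per the kernel's diskstats layout, 512-byte sectors assumed):

```python
# Estimate per-disk read IOPS and read throughput by sampling
# /proc/diskstats at two points in time and diffing the counters.
import time

def read_diskstats():
    """Return {device: (reads_completed, sectors_read)} from /proc/diskstats."""
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            name = fields[2]
            reads_completed = int(fields[3])  # reads completed successfully
            sectors_read = int(fields[5])     # sectors read (512 bytes each)
            stats[name] = (reads_completed, sectors_read)
    return stats

def sample_read_load(interval=1.0):
    """Sample twice, `interval` seconds apart; return {device: (iops, MB/s)}."""
    before = read_diskstats()
    time.sleep(interval)
    after = read_diskstats()
    result = {}
    for name, (r1, s1) in after.items():
        r0, s0 = before.get(name, (r1, s1))
        iops = (r1 - r0) / interval
        mb_per_s = (s1 - s0) * 512 / interval / 1e6
        result[name] = (iops, mb_per_s)
    return result

if __name__ == "__main__":
    for disk, (iops, mb_per_s) in sorted(sample_read_load().items()):
        print(f"{disk}: {iops:.0f} read IOPS, {mb_per_s:.1f} MB/s read")
```

Running this on the affected ParityDB nodes while sync is stalled (vs. on the healthy RocksDB nodes) would show whether the ~15k IOPS figure is sustained read amplification from the database itself.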
