Collators sometimes miss blocks #5349

@JayPavlina

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

A bug introduced in #3308 causes collators to sometimes miss blocks, which lengthens block times and triggers a reorg. To reproduce the issue, build the parachain template and polkadot, then run them with zombienet. The latency screen will show something like this:

[Screenshot of the latency screen, 2024-08-12]

Reverting all changes from #3308 fixes the problem: the collators no longer periodically miss blocks. The bug seems to occur whether or not async backing is enabled, but I mostly tested on older versions without it.

In my testing, the bug only occurs if both the relay chain and the parachain use binaries that include the changes from #3308. If either one is built from a version before that change, the collators perform normally.

We experienced this bug on our testnet when upgrading Enjin Blockchain to Polkadot SDK v1.9.0. We worked backwards to find the first version that worked correctly. We solved the issue by forking the SDK and reverting the changes from #3308.

Steps to reproduce

  1. Build the parachain template
  2. Build polkadot
  3. Run them with zombienet
  4. Check the latency and notice that block times are periodically much longer than usual
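
For anyone who prefers to check step 4 without watching the latency screen, here is a minimal sketch of how long block times could be flagged from a list of block timestamps. Everything in it is an assumption for illustration: the function name `find_slow_blocks` is hypothetical, the 12-second expected block time and the 1.5x tolerance are guesses you should adjust to your chain, and the sample timestamps are made up, not measurements from this bug. Collecting the timestamps from the node is left out.

```python
from typing import List, Tuple


def find_slow_blocks(
    timestamps_ms: List[int],
    expected_ms: int = 12_000,  # assumed target block time; adjust for your chain
    tolerance: float = 1.5,     # assumed threshold multiplier, not from the issue
) -> List[Tuple[int, int]]:
    """Return (index, gap_ms) for every block that arrived late.

    A block counts as late when the gap to the previous block is larger
    than expected_ms * tolerance.
    """
    slow = []
    for i in range(1, len(timestamps_ms)):
        gap = timestamps_ms[i] - timestamps_ms[i - 1]
        if gap > expected_ms * tolerance:
            slow.append((i, gap))
    return slow


if __name__ == "__main__":
    # Made-up timestamps: the fourth block arrives one slot late.
    sample = [0, 12_000, 24_000, 48_000, 60_000]
    print(find_slow_blocks(sample))  # [(3, 24000)]
```

A gap of roughly twice the expected block time would be consistent with a collator missing its slot, though this check alone cannot tell you why the block was late.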

Here is the zombienet config I used:

[settings]
timeout = 1000

[relaychain]
default_command = "path/to/polkadot"
chain = "rococo-local"

    [[relaychain.nodes]]
    name = "Alice"
    validator = true

    [[relaychain.nodes]]
    name = "Bob"
    validator = true

[[parachains]]
id = 1000
name = "parachain-template-node"
cumulus_based = true
add_to_genesis = true
register_para = true

    [[parachains.collators]]
    name = "Alice"
    command = "path/to/template-node"
    args = ["-ldebug"]

    [[parachains.collators]]
    name = "Bob"
    command = "path/to/template-node"
    args = ["-ldebug"]

Labels

  • I10-unconfirmed: Issue might be valid, but it's not yet known.
  • I2-bug: The node fails to follow expected behavior.