
regression: enabling rackawareness causes severe throughput drops #2071

@lizthegrey

Versions
Sarama: v1.29.0
Kafka: v3.0.0 (Confluent 7)
Go: v1.17.1

Regression has been bisected to 1aac8e5

See #1927 (comment) for another report besides mine.

Configuration

Pertinent config variables:

	c.conf = conf
	if c.LD != nil && c.LD.BoolVariationCtx(
		ctx,
		launchdarkly.FlagKafkaRackAwareFollowerFetch,
		types.UserAndTeam{User: nil, Team: nil}) {
		c.rackID = os.Getenv("AZ")
	}

Broker properties (templated):

	broker.rack=<%= node["ec2"]["availability_zone"] %>
	replica.selector.class=<%= node["kafka"]["replica"]["selector"]["class"] %>

Attribute default for the selector class:

	default['kafka']['replica']['selector']['class'] = "org.apache.kafka.common.replica.RackAwareReplicaSelector"
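
For anyone trying to reproduce this, here is a minimal, dependency-free sketch of the gating in the Go snippet above. The helper name `resolveRackID` is hypothetical and not part of our code or of Sarama; the flag is passed in as a plain bool instead of a LaunchDarkly lookup. The assumption is that the resulting string ends up in the `RackID` field of the Sarama config, and that an empty string leaves rack awareness off.

```go
package main

import (
	"fmt"
	"os"
)

// resolveRackID (hypothetical helper) mirrors the gating in the issue:
// a rack ID is only returned when the feature flag is enabled.
// An empty result means no rack is advertised by the consumer.
func resolveRackID(flagEnabled bool, az string) string {
	if !flagEnabled {
		return ""
	}
	return az
}

func main() {
	// AZ is the environment variable named in the issue; it may be unset,
	// in which case os.Getenv returns the empty string.
	rackID := resolveRackID(true, os.Getenv("AZ"))
	if rackID == "" {
		fmt.Println("rack awareness off: no rack ID set")
		return
	}
	// Assumption: a real consumer would assign this value to the RackID
	// field of the Sarama config before creating the client.
	fmt.Printf("rack awareness on: rack=%q\n", rackID)
}
```

Running it with and without `AZ` exported shows the two code paths that behave differently in the report below.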
Problem Description

When AZ is not populated (due to a bug on our side, hah), or when FlagKafkaRackAwareFollowerFetch is false, things behave normally. But when the flag is set, AZ is populated, and the Sarama library is at or past 1aac..., throughput drops by 75%: https://share.getcloudapp.com/nOu54vNq
There is a corresponding lag in timestamps between producer and consumer: https://share.getcloudapp.com/DOu6LBAQ
