Skip to content

balancer: connectivity state aggregation algorithm needs fixing #5458

@easwars

Description

@easwars

The round_robin LB policy's implementation is broken into two pieces:

  1. The base balancer implementation found here, and
  2. Picker implementation specific to round_robin found here

The base balancer implementation uses the connectivity state aggregation logic provided by ConnectivityStateEvaluator:

type ConnectivityStateEvaluator struct {

The algorithm is as follows:

//  - If at least one SubConn in Ready, the aggregated state is Ready;
//  - Else if at least one SubConn in Connecting, the aggregated state is Connecting;
//  - Else if at least one SubConn is TransientFailure, the aggregated state is Transient Failure;
//  - Else if at least one SubConn is Idle, the aggregated state is Idle;
//  - Else there are no subconns and the aggregated state is Transient Failure

The algorithm as defined in the load balancing spec is as follows though:

The policy sets the channel's connectivity state by aggregating the states of the subchannels:

- If any one subchannel is in READY state, the channel's state is READY.
- Otherwise, if there is any subchannel in state CONNECTING, the channel's state is CONNECTING.
- Otherwise, if there is any subchannel in state IDLE, the channel's state is IDLE.
- Otherwise, if all subchannels are in state TRANSIENT_FAILURE, the channel's state is TRANSIENT_FAILURE.

Note that when a given subchannel reports TRANSIENT_FAILURE, it is considered to still be in
TRANSIENT_FAILURE until it successfully reconnects and reports READY. In particular, we ignore 
the transition from TRANSIENT_FAILURE to CONNECTING.

Note that the implemented algorithm gives precedence to IDLE over TRANSIENT_FAILURE. This works fine for round_robin because in round_robin, we push the subConn into CONNECTING as soon as it enters IDLE. But if we want to use this connectivity state aggregation algorithm in other LB policies, IDLE should take precedence over TRANSIENT_FAILURE. For example, this is exactly what we do in weightedtarget:

var aggregatedState connectivity.State

We even have a TODO in weightedtarget to use balancer.ConnectivityStateEvaluator:

// TODO: use balancer.ConnectivityStateEvaluator to calculate the aggregated
We cannot use the latter unless we fix the algorithm implementation.

Also, the c-core implementation of round_robin sets the connectivity state of the subConn to CONNECTING when it enters IDLE because the LB policy starts connecting as soon the subConn enters IDLE. We also do the latter, but we don't do the former.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions