
Conversation

zsxwing (Member) commented Jul 24, 2019

What changes were proposed in this pull request?

`KafkaOffsetRangeCalculator.getRanges` may drop offsets due to round-off errors. The test added in this PR is one example.

This PR rewrites the logic in `KafkaOffsetRangeCalculator.getRanges` to ensure it never drops offsets.
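
For context, a minimal illustrative sketch (not the actual pre-fix Spark code) of the failure mode: splitting a range into fixed, rounded-down chunks silently drops the tail offsets whenever the size is not evenly divisible.

  // Illustrative only: split [fromOffset, untilOffset) into `parts` chunks of a
  // fixed size. Integer division rounds down, so the tail of the range is lost
  // whenever the size is not evenly divisible by `parts`.
  def naiveSplit(fromOffset: Long, untilOffset: Long, parts: Int): Seq[(Long, Long)] = {
    val chunk = (untilOffset - fromOffset) / parts
    (0 until parts).map { i =>
      (fromOffset + i * chunk, fromOffset + (i + 1) * chunk)
    }
  }

  // e.g. naiveSplit(0, 10, 3) => Seq((0, 3), (3, 6), (6, 9)): offset 9 is silently dropped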

How was this patch tested?

The regression test.

@zsxwing zsxwing changed the title [SPARK-28489]Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets [SPARK-28489][SS]Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets Jul 24, 2019
val tp = range.topicPartition
val size = range.size
// number of partitions to divvy up this topic partition to
val parts = math.max(math.round(size.toDouble / totalSize * minPartitions.get), 1).toInt
zsxwing (Member, Author):

This one ensures we never drop a TopicPartition.
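
For illustration, with hypothetical numbers, the `math.max(..., 1)` guard is what keeps a tiny partition from rounding away entirely:

  // size = 1, totalSize = 1000, minPartitions = 4
  // round(1.0 / 1000 * 4) = round(0.004) = 0, but math.max(0, 1) = 1,
  // so even the smallest TopicPartition still gets one KafkaOffsetRange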

Member:

The ratio calculation looks good, but `round` seems to generate fewer partitions. Is there a reason to choose `round` instead of ceiling?

Contributor:

Yeah, I'm seeing the same. Suppose each of 4 offsetRanges loses 0.25 of a partition to rounding; then we lose 1 partition overall. The number of lost partitions may vary.

On the other hand, if we use ceil, it may exceed the minimum partitions, and the number of extra partitions may vary. We don't guarantee that this calculator returns a partition count closest to the minimum partitions, so it seems OK.

If we really want to make this strict, we could apply "allocation": calculate the ratio for each offsetRange, allocate partitions to each offsetRange according to that ratio (with a minimum of 1 for safety), and hand out any extra partitions to some offsetRanges if partitions remain. Not sure we want to deal with that complexity; a rough sketch of the idea is below.
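
For concreteness, a rough, hypothetical sketch of that allocation idea (largest-remainder style); the `Range` case class and `allocate` helper are illustrative, not part of this PR:

  // Hypothetical allocation scheme: floor each range's proportional share
  // (never below 1), then give leftover partitions to the ranges with the
  // largest fractional remainders. (Assumes `ranges` is non-empty.)
  final case class Range(size: Long)

  def allocate(ranges: Seq[Range], minPartitions: Int): Seq[Int] = {
    val totalSize = ranges.map(_.size).sum.toDouble
    val exact = ranges.map(r => r.size / totalSize * minPartitions) // ideal fractional shares
    val base = exact.map(e => math.max(e.toInt, 1))                 // floor, at least 1 each
    var leftover = minPartitions - base.sum                         // can be <= 0
    val result = base.toArray
    // Hand out leftovers to the ranges with the largest fractional parts.
    for ((_, idx) <- exact.zipWithIndex.sortBy { case (e, _) => -(e - e.floor) } if leftover > 0) {
      result(idx) += 1
      leftover -= 1
    }
    result.toSeq
  }

  // e.g. allocate(Seq(Range(29), Range(29), Range(29)), 4) == Seq(2, 1, 1)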

zsxwing (Member, Author):

Yep, it's a hint. And when the number of Kafka partitions is less than `minPartitions`, we will try our best to split. Agreed that the option name `minPartitions` is not accurate.

Member:

Then, could you update the documentation to be more accurate instead?

zsxwing (Member, Author):

@dongjoon-hyun I think the doc for this method is accurate:

* The number of Spark tasks will be *approximately* `numPartitions`. It can be less or more

Member:

A few days ago, `minPartitions` was added to the documentation for master/branch-2.4 via #25219.

var startOffset = range.fromOffset
(0 until parts).map { part =>
// Fine to do integer division. Last partition will consume all the round off errors
val thisPartition = remaining / (parts - part)
zsxwing (Member, Author):

`thisPartition` will be the same as `remaining` for the last part. This ensures we always get a `KafkaOffsetRange` ending with `range.untilOffset`.
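
To make the remainder handling concrete, here is a simplified, illustrative version of the splitting loop above (the tuple return type is just a stand-in for `KafkaOffsetRange`, and the preferred-location details are omitted):

  // Simplified version of the loop above: integer division per step, with the
  // last piece absorbing any remainder so the final range ends exactly at
  // untilOffset and no offsets are dropped.
  def split(fromOffset: Long, untilOffset: Long, parts: Int): Seq[(Long, Long)] = {
    var remaining = untilOffset - fromOffset
    var startOffset = fromOffset
    (0 until parts).map { part =>
      val thisPartition = remaining / (parts - part) // == remaining when part == parts - 1
      val endOffset = startOffset + thisPartition
      val piece = (startOffset, endOffset)
      remaining -= thisPartition
      startOffset = endOffset
      piece
    }
  }

  // e.g. split(0, 10, 3) => Seq((0, 3), (3, 6), (6, 10)): the last range ends at untilOffset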

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-28489][SS]Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets Jul 24, 2019
SparkQA commented Jul 24, 2019

Test build #108067 has finished for PR 25237 at commit d2d3e95.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dongjoon-hyun (Member) left a comment:

Thank you for the fix, @zsxwing. Could you handle a corner case like the following together in this PR? Although this is not a regression (the master branch behaves the same), we currently get fewer partitions than the given `minPartitions` in some cases. For example, the following test passes.

  test("with minPartition = 4") {
    val options = new CaseInsensitiveStringMap(Map("minPartitions" -> "4").asJava)
    val calc = KafkaOffsetRangeCalculator(options)
    assert(
      calc.getRanges(
        fromOffsets = Map(tp1 -> 0, tp2 -> 0, tp3 -> 0),
        untilOffsets = Map(tp1 -> 29, tp2 -> 29, tp3 -> 29)) ==
        Seq(
          KafkaOffsetRange(tp1, 0, 29, None),
          KafkaOffsetRange(tp2, 0, 29, None),
          KafkaOffsetRange(tp3, 0, 29, None)))
  }
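
For reference, the arithmetic behind this corner case (assuming the `parts` formula from the diff above):

  // totalSize = 29 + 29 + 29 = 87
  // parts per topic partition = max(round(29.0 / 87 * 4), 1) = max(round(1.33), 1) = 1
  // total ranges = 3, which is below minPartitions = 4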

dongjoon-hyun (Member):

cc @tdas, @HeartSaVioR, @gaborgsomogyi

Also, cc @gatorsmile since this is reported as a blocker issue for 2.4.4.
I'll include this in the 2.4.4 release.

offsetRange
}
}
}.filter(_.size > 0)
HeartSaVioR (Contributor) commented Jul 24, 2019:

I'm not sure whether this can happen, but suppose it can (we have this filter, and we are doing integer division); then we could still end up with fewer than `minPartitions` even if the ratio-based distribution calculation is correct.
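
As a concrete illustration (using the simplified `split` sketch shown earlier, not the actual code): when a range is smaller than the number of parts assigned to it, some sub-ranges come out empty and are filtered away.

  // split(0, 2, 4) => (0, 0), (0, 0), (0, 1), (1, 2)
  // the two empty ranges are dropped by .filter(_.size > 0),
  // leaving only 2 ranges even though parts = 4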

HeartSaVioR (Contributor) commented Jul 24, 2019

Hmm... I'm now reading the comment on `getRanges`. I'm not sure whether `numPartitions` actually means `minPartitions` (if so, there are some typos in the javadoc; maybe better to fix them here), but if they are the same, the comment below says the method doesn't guarantee that the returned number of partitions is equal to or greater than `minPartitions`.

The number of Spark tasks will be approximately numPartitions. It can be less or more depending on rounding errors or Kafka partitions that didn't receive any new data.

/**
* Calculate the offset ranges that we are going to process this batch. If `minPartitions`
* is not set or is set less than or equal the number of `topicPartitions` that we're going to
* consume, then we fall back to a 1-1 mapping of Spark tasks to Kafka partitions. If
* `numPartitions` is set higher than the number of our `topicPartitions`, then we will split up
* the read tasks of the skewed partitions to multiple Spark tasks.
* The number of Spark tasks will be *approximately* `numPartitions`. It can be less or more
* depending on rounding errors or Kafka partitions that didn't receive any new data.
*
* Empty ranges (`KafkaOffsetRange.size <= 0`) will be dropped.
*/
def getRanges(
fromOffsets: PartitionOffsetMap,
untilOffsets: PartitionOffsetMap,
executorLocations: Seq[String] = Seq.empty): Seq[KafkaOffsetRange] = {

Please ignore my review comments if that's what the javadoc meant. Looks great.

SparkQA commented Jul 24, 2019

Test build #108129 has finished for PR 25237 at commit c4010a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dongjoon-hyun (Member) left a comment:

+1, LGTM. Thank you for the fix. For the documentation, let's update it later.
Merged to master/2.4.

dongjoon-hyun pushed a commit that referenced this pull request Jul 26, 2019
… may drop offsets

## What changes were proposed in this pull request?

`KafkaOffsetRangeCalculator.getRanges` may drop offsets due to round off errors. The test added in this PR is one example.

This PR rewrites the logic in `KafkaOffsetRangeCalculator.getRanges` to ensure it never drops offsets.

## How was this patch tested?

The regression test.

Closes #25237 from zsxwing/fix-range.

Authored-by: Shixiong Zhu <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit b9c2521)
Signed-off-by: Dongjoon Hyun <[email protected]>
@zsxwing zsxwing deleted the fix-range branch July 26, 2019 07:42
HeartSaVioR (Contributor):

#25332 is a follow-up PR to address the documentation.

dongjoon-hyun (Member):

Thanks, @HeartSaVioR! It's merged.

rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019