Commit 7ffc00c

HeartSaVioR authored and dongjoon-hyun committed
[MINOR][DOC][SS] Correct description of minPartitions in Kafka option
## What changes were proposed in this pull request?

`minPartitions` has been used as a hint and relevant method (KafkaOffsetRangeCalculator.getRanges) doesn't guarantee the behavior that partitions will be equal or more than given value.

https://github.com/apache/spark/blob/d67b98ea016e9b714bef68feaac108edd08159c9/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala#L32-L46

This patch makes clear the configuration is a hint, and actual partitions could be less or more.

## How was this patch tested?

Just a documentation change.

Closes #25332 from HeartSaVioR/MINOR-correct-kafka-structured-streaming-doc-minpartition.

Authored-by: Jungtaek Lim (HeartSaVioR) <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent b148bd5 commit 7ffc00c

1 file changed

Lines changed: 4 additions & 2 deletions

docs/structured-streaming-kafka-integration.md

@@ -393,10 +393,12 @@ The following configurations are optional:
 <td>int</td>
 <td>none</td>
 <td>streaming and batch</td>
-<td>Minimum number of partitions to read from Kafka.
+<td>Desired minimum number of partitions to read from Kafka.
 By default, Spark has a 1-1 mapping of topicPartitions to Spark partitions consuming from Kafka.
 If you set this option to a value greater than your topicPartitions, Spark will divvy up large
-Kafka partitions to smaller pieces.</td>
+Kafka partitions to smaller pieces. Please note that this configuration is like a `hint`: the
+number of Spark tasks will be **approximately** `minPartitions`. It can be less or more depending on
+rounding errors or Kafka partitions that didn't receive any new data.</td>
 </tr>
 <tr>
 <td>groupIdPrefix</td>
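
To illustrate why the new wording says "approximately", here is a minimal Python sketch of a hint-style splitting rule. It is a simplified model written for this note, not Spark's `KafkaOffsetRangeCalculator` code: the function name `estimate_task_count` and the exact rounding rule (round half up on each range's share of an ideal size) are assumptions chosen to show how per-range rounding can land below or above the requested value.

```python
import math


def estimate_task_count(range_sizes, min_partitions=None):
    """Hypothetical model of hint-style splitting; NOT Spark's actual implementation.

    range_sizes: number of new offsets available in each Kafka topic partition.
    min_partitions: the requested hint, or None when the option is unset.
    """
    # Partitions that received no new data produce no task at all.
    non_empty = [size for size in range_sizes if size > 0]

    # Without a hint, or when there are already more ranges than requested,
    # keep one task per non-empty topic partition.
    if min_partitions is None or len(non_empty) > min_partitions:
        return len(non_empty)

    # Otherwise split each range into pieces close to an "ideal" size.
    ideal_size = sum(non_empty) / min_partitions

    # Assumed rule: round half up, applied per range. The per-range rounding
    # is what lets the total drift away from min_partitions.
    return sum(math.floor(size / ideal_size + 0.5) for size in non_empty)


print(estimate_task_count([10, 10, 10], min_partitions=4))  # 3, fewer than requested
print(estimate_task_count([15, 15], min_partitions=3))      # 4, more than requested
print(estimate_task_count([100, 0, 0]))                     # 1, empty partitions are skipped
```

In this model, three equal ranges each round their 1.33 share down to 1, giving 3 tasks for a hint of 4, while two equal ranges each round their 1.5 share up to 2, giving 4 tasks for a hint of 3.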