Use Oldest offset as the initial one when creating the consumer group #428
slinkydeveloper wants to merge 1 commit into knative-extensions:master
Signed-off-by: Francesco Guardiani <[email protected]>
The following is the coverage report on the affected files.
Codecov Report
@@            Coverage Diff             @@
##           master     #428      +/-   ##
==========================================
+ Coverage   73.44%   73.45%   +0.01%
==========================================
  Files         129      129
  Lines        5008     5011       +3
==========================================
+ Hits         3678     3681       +3
  Misses       1094     1094
  Partials      236      236
I honestly don't know. cc @travis-minke-sap
Because the default behaviour doesn't actually reflect the user's expectation, which is: if no offset is committed, the consumer group should start reading from the beginning of the partition. In other words, this should be the default behaviour.
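To make the two behaviours under discussion concrete, here is a minimal stdlib-only sketch of how the initial offset setting only matters when nothing has been committed yet. The constants mirror the values of `sarama.OffsetNewest` and `sarama.OffsetOldest`; `resolveStart` is a hypothetical helper written for illustration, not code from this PR or from sarama.

```go
package main

import "fmt"

// These values mirror sarama.OffsetNewest and sarama.OffsetOldest.
const (
	offsetNewest int64 = -1
	offsetOldest int64 = -2
)

// resolveStart is a hypothetical model of where a consumer group starts
// reading a partition. A negative committed value means "nothing committed".
// oldest is the first available offset; newest is the offset the next
// produced message will receive.
func resolveStart(committed, initial, oldest, newest int64) int64 {
	if committed >= 0 {
		// A committed offset always wins over the initial setting.
		return committed
	}
	if initial == offsetOldest {
		return oldest
	}
	return newest
}

func main() {
	// Partition currently holding offsets 0..41.
	fmt.Println(resolveStart(-1, offsetOldest, 0, 42)) // new group, oldest policy
	fmt.Println(resolveStart(-1, offsetNewest, 0, 42)) // new group, newest policy
	fmt.Println(resolveStart(17, offsetOldest, 0, 42)) // existing group resumes
}
```

With the newest policy, a brand-new group skips everything already in the partition, which is the behaviour the linked issue runs into.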
I think it would be good for the distributed and consolidated channels to support the same configuration, and I support the idea of making it a config setting. I prefer that we not change the distributed implementation to default to the oldest offset UNTIL the configuration option has been added so that current users will have the option to maintain the current behavior.
I'm curious as to why the oldest offset should be the default? How do you know that is the user's assumption / expectation? I'm not saying you're wrong - just that both assumptions seem equally valid. Have we heard otherwise from users? I can imagine high-volume use cases where historical data (from days/weeks ago) is not of any use to a subscriber and they just want to start receiving new events.
After reading associated Issue #420 ; ) This seems like a startup bootstrap edge case that we're just using the initial offset as an easy fix for? Should the dispatcher maybe commit the previous offset (not sure if that's possible) upon startup to put a marker in place as to where it started in case the first event fails?
Again (regardless of whatever the default value might be) I still think making it configurable is a win if we have the time ; )
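As a rough illustration of what such a setting could look like, the sketch below maps a string value to an initial offset and keeps the current behaviour (newest) when the value is unset. The accepted strings and the `parseInitialOffset` helper are assumptions made for this example; they are not an existing option in either channel implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// These values mirror sarama.OffsetNewest and sarama.OffsetOldest.
const (
	offsetNewest int64 = -1
	offsetOldest int64 = -2
)

// parseInitialOffset is a hypothetical helper that turns a config value
// into an initial offset. An empty value keeps the current default
// (newest), so existing users see no change unless they opt in.
func parseInitialOffset(value string) (int64, error) {
	switch strings.ToLower(strings.TrimSpace(value)) {
	case "", "newest":
		return offsetNewest, nil
	case "oldest":
		return offsetOldest, nil
	default:
		return 0, fmt.Errorf("unknown initial offset %q (want \"oldest\" or \"newest\")", value)
	}
}

func main() {
	for _, v := range []string{"", "oldest", "Newest", "earliest"} {
		off, err := parseInitialOffset(v)
		fmt.Println(off, err)
	}
}
```

Defaulting to newest when the value is missing follows the request above to add the option before changing any default.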
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: matzew
/hold
func NewDispatcher(ctx context.Context, args *KafkaDispatcherArgs) (*KafkaDispatcher, error) {
	confTemplate := sarama.NewConfig()
	confTemplate.Consumer.Offsets.Initial = sarama.OffsetOldest
What happens on upgrades (e.g. from 0.21 to 0.22) when this changes?
Does it have side effects?
I was too quick with the LGTM.
Ok I'm thinking about this again. Because our consumer groups are per subscription, when a new subscription starts, using
I'm pretty sure you can't commit -1 as offset.
It's still configurable this way, you can change it in
That sounds like a good idea
right - it would start every new subscription from the beginning of the channel (kafka topic/log)
Yeah, sorry - I didn't mean commit
Yeah, agreed - I only meant that we need to implement the configurability before we change the "default" (at least for distributed implementation) rather than change the default and maybe later get around to making it configurable ; ). cart/horse vs horse/cart is all - haha
👍
This would be -1 when the topic is new (like in the example of #420)
This change doesn't affect the distributed impl, right? Are you worried it might cause inconsistencies between the two impls?
Yeah, I assume we'd be able to handle the 0 case to not subtract 1 - just an idea - same issue arises if we determine the current offset from the timestamp - it might be 0 as well.
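One reading of the "marker" idea, including the zero case mentioned above, can be sketched as follows. `startupMarker` is a hypothetical helper, and `logEnd` is assumed to be the offset the next produced message will receive; how the dispatcher would obtain that value and commit the marker is not covered here.

```go
package main

import "fmt"

// startupMarker is a hypothetical helper that picks the offset to record
// when a dispatcher starts without any committed offset. It returns the
// offset just before the log end, except on a new (empty) topic, where
// subtracting 1 would give -1, which is not a valid offset to commit.
func startupMarker(logEnd int64) int64 {
	if logEnd <= 0 {
		return 0
	}
	return logEnd - 1
}

func main() {
	fmt.Println(startupMarker(0))  // new topic
	fmt.Println(startupMarker(42)) // topic already holding offsets 0..41
}
```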
Correct - this PR does not affect the distributed implementation, and I (as a user of the distributed channel) am not affected by this change and am fine with it proceeding. The comments are just around @aliok's question of how to do this in a consistent way for both channels. If there is urgency for making this change in the consolidated channel, and we later want to add configurability to distributed/both and change the distributed default value, that's ok with me ; ). The only downside (as you mention) is the difference between the two channels for that interim time period.
devguyio left a comment
I think this is not correct. @slinkydeveloper, as you've said, this means that a subscription can end up receiving events from days before its creation, and that's not correct.
Ok then I'll close it and we can tell the user to use the config-kafka map to fix their corner case
Signed-off-by: Francesco Guardiani [email protected]
Fixes #420
Proposed Changes
Release Note
Docs