Pause/resume a consumer by topic #67
mfontanini
left a comment
Looks good, just a couple of comments.
src/consumer.cpp
Outdated
TopicPartitionList matches;
for (const auto& topic : topics) {
    for (auto& partition : partitions) {
        bool match = equal(topic.begin(), topic.end(), partition.get_topic().begin(),
Won't this blow up if topic.size() > partition.get_topic().size()?
Yeah, looks like the signature was improved in C++14 with an overload that takes a fourth (end) iterator. I'll put in a boundary check.
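For illustration, the boundary check mentioned above could look like the following minimal sketch. The helper name `iequals` is hypothetical and not taken from the PR. It compares sizes first, so the three-iterator `std::equal` never reads past the end of the second string.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from the PR): case-insensitive string equality.
// The size check runs first, so the three-iterator std::equal below can
// never walk past the end of rhs.
bool iequals(const std::string& lhs, const std::string& rhs) {
    if (lhs.size() != rhs.size()) {
        return false;
    }
    return std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                      [](unsigned char a, unsigned char b) {
                          return std::tolower(a) == std::tolower(b);
                      });
}
```

With this shape, a topic name longer than the partition's topic simply fails the size check instead of triggering an out-of-bounds read.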
src/consumer.cpp
Outdated
    static_cast<Consumer*>(opaque)->handle_rebalance(error, list);
}

TopicPartitionList Consumer::get_matching_partitions(TopicPartitionList&& partitions,
Why is this an rvalue reference and not a const ref?
I move it internally, so I don't need to make a copy. The return value of get_assignment is passed in directly as an rvalue. Unless you think you may need to use this function elsewhere in the class...?
I didn't see the move. But anyway, in that case it should be taken by value, not by rvalue ref. An rvalue argument will be moved into the parameter, so no copy is made.
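As a rough sketch of the by-value suggestion: the functions below are hypothetical stand-ins that use a plain `std::vector<std::string>` instead of `TopicPartitionList`. A by-value parameter accepts a temporary without copying, and it still works when the caller passes an lvalue it wants to keep.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function that consumes a list.
// Taking the parameter by value means an rvalue argument is moved in
// (no copy), while an lvalue argument is copied once and the caller
// keeps its own list untouched.
std::vector<std::string> keep_non_empty(std::vector<std::string> items) {
    std::vector<std::string> result;
    for (auto& item : items) {
        if (!item.empty()) {
            // items is our own local object, so moving out of it is safe.
            result.push_back(std::move(item));
        }
    }
    return result;
}

// Hypothetical producer whose return value can be passed straight in.
std::vector<std::string> make_items() {
    return {"a", "", "b"};
}
```

Calling `keep_non_empty(make_items())` moves the temporary into the parameter, whereas an rvalue-reference parameter would reject a named list unless the caller wrote `std::move` explicitly.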
src/consumer.cpp
Outdated
    pause_partitions(get_assignment());
}

void Consumer::pause_topics(const std::vector<string>& topics) {
I made the subset selector function global since lots of methods take a topic partition list which could be used by topic instead. Therefore, instead of providing overloads for all of them (very tedious), the selector by topic is now a util.
CPPKAFKA_API TopicPartitionsListPtr make_handle(rd_kafka_topic_partition_list_t* handle);

// Extracts a partition list subset belonging to the provided topics (case-insensitive)
CPPKAFKA_API TopicPartitionList make_subset(const TopicPartitionList& partitions,
This is okay here but I'm not sure about the name, "make_subset" doesn't really explain what the function does. Maybe find_matches or something like that? As much as this does return a subset of the input, the important thing is how the subset is created and not the fact that it will return a subset.
src/topic_partition_list.cpp
Outdated
                                const vector<string>& topics) {
    vector<bool> skip(partitions.size(), false);
    TopicPartitionList subset;
    for (const auto& topic : topics) {
If you flip this loop (first over partitions, then over topics), won't you get rid of the skip vector? Is that there in case there's duplicates in topics or am I missing something? Normally I'd use a set/unordered_set for topics as you want to enforce that there's actually no dupes (plus it makes the lookup much simpler once the loop is flipped). Also, even if this is a simple function, a simple test making sure it does what you want would be nice.
Agreed, the loop flip did make things simpler. I also added the set instead of the vector and two test cases.
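A minimal sketch of the flipped loop, assuming a simplified `Partition` struct in place of cppkafka's `TopicPartition`. The names `Partition`, `same_topic`, and `find_matches` are hypothetical. Iterating partitions in the outer loop and breaking on the first matching topic makes the `skip` vector unnecessary.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for cppkafka's TopicPartition (assumption).
struct Partition {
    std::string topic;
    int id;
};

// Case-insensitive topic comparison with an explicit size check.
bool same_topic(const std::string& lhs, const std::string& rhs) {
    return lhs.size() == rhs.size() &&
           std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                      [](unsigned char a, unsigned char b) {
                          return std::tolower(a) == std::tolower(b);
                      });
}

// Outer loop over partitions, inner loop over topics. Each partition is
// added at most once because of the break, even if the set holds two
// spellings of the same topic (e.g. "foo" and "Foo").
std::vector<Partition> find_matches(const std::vector<Partition>& partitions,
                                    const std::set<std::string>& topics) {
    std::vector<Partition> subset;
    for (const auto& partition : partitions) {
        for (const auto& topic : topics) {
            if (same_topic(topic, partition.topic)) {
                subset.push_back(partition);
                break;
            }
        }
    }
    return subset;
}
```

The result preserves the order of the input partitions, which the topic-first loop does not guarantee.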
src/topic_partition_list.cpp
Outdated
                                const set<int>& ids) {
    TopicPartitionList subset;
    for (const auto& partition : partitions) {
        for (const auto& id : ids) {
This whole loop can be replaced by
    if (ids.count(partition.get_partition()) > 0) {
        subset.emplace_back(partition);
    }
I forgot you were doing the case-insensitive comparisons for the topic case, so you can't do this there, but this is mostly what I was going for with the set suggestion.
I can use this code for the partition case. Not a bad idea. I think I've seen it elsewhere in your code.
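Put together, the partition-id variant might look like the sketch below. It again uses a hypothetical `Partition` struct and function name rather than the real cppkafka types. A single `std::set::count` lookup replaces the inner loop over ids.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for cppkafka's TopicPartition (assumption).
struct Partition {
    std::string topic;
    int id;
};

// Keeps only the partitions whose id appears in ids. The set lookup
// replaces the nested loop over ids.
std::vector<Partition> find_matches_by_id(const std::vector<Partition>& partitions,
                                          const std::set<int>& ids) {
    std::vector<Partition> subset;
    for (const auto& partition : partitions) {
        if (ids.count(partition.id) > 0) {
            subset.emplace_back(partition);
        }
    }
    return subset;
}
```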
Pushed some simplification, hopefully I didn't mess anything up as I edited it through the gh editor. I'll merge once build succeeds.
The base class KafkaHandleBase currently allows pausing/resuming by partitions, which has fine granularity. In some use cases, the partitions are not really known or exposed to the application. Furthermore, pausing/resuming is mostly thought of conceptually at the topic level rather than at the partition level. These methods facilitate this.

Note: I debated whether or not these should go into the base class as well, but there are a few caveats with the producer.