
Conversation

@sharkdtu
Contributor

What changes were proposed in this pull request?

When I set the conf spark.streaming.dynamicAllocation.enabled=true, the conf num-executors cannot be set. As a result, the application allocates the default 2 executors, and all receivers run on those 2 executors; there may be no spare CPU cores left for batch tasks, so the application gets stuck indefinitely.

In my opinion, we should remove this unnecessary restriction for streaming dynamic allocation, so that num-executors and spark.streaming.dynamicAllocation.enabled=true can be set together. Then, when the application starts, each receiver can run on its own executor.
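To illustrate the scenario (a hypothetical sketch; the class name and jar are placeholders, and the exact validation behavior depends on the Spark version), a submission like this cannot carry `--num-executors`, so the application starts with YARN's default of 2 executors:

```shell
# Hypothetical submission: with streaming dynamic allocation enabled,
# --num-executors (spark.executor.instances) is rejected, so the app
# starts with the YARN default of 2 executors. If the receivers alone
# need more cores than those 2 executors provide, no batch task can run.
spark-submit \
  --master yarn \
  --conf spark.streaming.dynamicAllocation.enabled=true \
  --class com.example.StreamingApp \
  app.jar
```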

How was this patch tested?

Manual test.

Member

@felixcheung felixcheung left a comment


shouldn't you use spark.streaming.dynamicAllocation.minExecutors?

@sharkdtu
Contributor Author

sharkdtu commented Jan 1, 2018

@felixcheung
At the beginning, if numReceivers > totalExecutorCores, there are no CPU cores left for batch processing, so ExecutorAllocationManager cannot observe metrics from any batches. As a result, it never scales up.
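The core accounting behind this argument can be sketched as follows (a simplified model, not Spark code; the function name is hypothetical). Each receiver pins one core for its lifetime, so only the remaining cores can run batch tasks; if nothing remains, no batch completes and the allocation manager has no batch metrics to react to:

```python
def free_batch_cores(num_executors: int, cores_per_executor: int,
                     num_receivers: int) -> int:
    """Cores left for batch tasks after each receiver pins one core
    (simplified model of Spark Streaming receiver scheduling)."""
    total_cores = num_executors * cores_per_executor
    return total_cores - num_receivers

# Default YARN start: 2 executors x 2 cores with 4 receivers leaves
# 0 free cores, so batches never run and ExecutorAllocationManager
# never sees batch-processing metrics to act on.
print(free_batch_cores(2, 2, 4))  # 0
print(free_batch_cores(5, 2, 4))  # 6
```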

@felixcheung
Member

Hmm, that sounds like a different problem. Why is numReceivers set greater than spark.cores.max?

@sharkdtu
Contributor Author

sharkdtu commented Jan 2, 2018

@felixcheung
If you submit Spark on YARN with spark.streaming.dynamicAllocation.enabled=true, num-executors cannot be set. So, at the beginning, there are only 2 executors (the default value).

@jerryshao
Contributor

Sorry to chime in. This feature (streaming dynamic allocation) is obsolete and has bugs, and users seldom enable it. Is it still worth fixing?

@sharkdtu
Contributor Author

sharkdtu commented Jan 3, 2018

@jerryshao
If this PR can fix bugs as you said, why not fix it? Otherwise, the feature should be marked as deprecated.

@jerryshao
Contributor

jerryshao commented Jan 3, 2018

I'm not against the fix. My concern is that we've shifted to Structured Streaming, and this feature (streaming dynamic allocation) is seldom used or tested. This might not be the only issue with it (dynamic allocation has been updated a lot), so do we still need to put effort into it?

Just my concern.

@felixcheung
Member

felixcheung commented Jan 4, 2018

Not commenting on this change specifically, but I used streaming dynamic allocation quite a bit back in the day.

In this case, though, I think the simple fix is to set spark.streaming.dynamicAllocation.minExecutors instead of num-executors, as pointed out above.
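A sketch of that suggestion (the resource numbers are hypothetical and would need to cover all receivers plus batch tasks): instead of `--num-executors`, the streaming dynamic-allocation bounds are set directly:

```shell
# Hypothetical alternative: bound the streaming allocation range so
# enough executors exist to host every receiver and still leave cores
# for batch processing. Whether this helps at startup is debated in
# the comments below.
spark-submit \
  --master yarn \
  --conf spark.streaming.dynamicAllocation.enabled=true \
  --conf spark.streaming.dynamicAllocation.minExecutors=5 \
  --conf spark.streaming.dynamicAllocation.maxExecutors=20 \
  app.jar
```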

@jerryshao
Contributor

Originally in Spark dynamic allocation, "spark.executor.instances" and the dynamic allocation confs could not coexist; if "spark.executor.instances" was set, dynamic allocation would not be enabled. But this behavior changed after 2.0.

I think streaming dynamic allocation should stay consistent with Spark dynamic allocation here.

@felixcheung
Member

Hmm, I actually didn't know that was changed (SPARK-13723).
But it still seems to me that spark.streaming.dynamicAllocation.minExecutors is a valid approach. Matching the non-streaming behavior would be a bigger change that, I agree, we probably shouldn't take on at this point.

@sharkdtu
Contributor Author

sharkdtu commented Jan 4, 2018

@felixcheung
Have you ever thought about the initial number of executors? It defaults to 2 when you run Spark on YARN. How can you make sure those 2 executors have enough cores for the receivers at the beginning?

@AmplabJenkins

Can one of the admins verify this patch?

@vc60er

vc60er commented Apr 13, 2018

Setting spark.streaming.dynamicAllocation.minExecutors has the same issue: https://issues.apache.org/jira/browse/SPARK-14788

@felixcheung

@srowen srowen mentioned this pull request May 11, 2018
@asfgit asfgit closed this in 348ddfd May 12, 2018
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#20458
Closes apache#20530
Closes apache#20557
Closes apache#20966
Closes apache#20857
Closes apache#19694
Closes apache#18227
Closes apache#20683
Closes apache#20881
Closes apache#20347
Closes apache#20825
Closes apache#20078

Closes apache#21281
Closes apache#19951
Closes apache#20905
Closes apache#20635

Author: Sean Owen <[email protected]>

Closes apache#21303 from srowen/ClosePRs.
