@@ -56,36 +56,31 @@ provide another approach to share RDDs.

## Dynamic Resource Allocation

-Spark 1.2 introduces the ability to dynamically scale the set of cluster resources allocated to
-your application up and down based on the workload. This means that your application may give
-resources back to the cluster if they are no longer used and request them again later when there
-is demand. This feature is particularly useful if multiple applications share resources in your
-Spark cluster. If a subset of the resources allocated to an application becomes idle, it can be
-returned to the cluster's pool of resources and acquired by other applications. In Spark, dynamic
-resource allocation is performed on the granularity of the executor and can be enabled through
-`spark.dynamicAllocation.enabled`.
-
-This feature is currently disabled by default and available only on [YARN](running-on-yarn.html).
-A future release will extend this to [standalone mode](spark-standalone.html) and
-[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes). Note that although Spark on
-Mesos already has a similar notion of dynamic resource sharing in fine-grained mode, enabling
-dynamic allocation allows your Mesos application to take advantage of coarse-grained low-latency
-scheduling while sharing cluster resources efficiently.
+Spark provides a mechanism to dynamically adjust the resources your application occupies based
+on the workload. This means that your application may give resources back to the cluster if they
+are no longer used and request them again later when there is demand. This feature is particularly
+useful if multiple applications share resources in your Spark cluster.
+
+This feature is disabled by default and available on all coarse-grained cluster managers, i.e.
+[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html), and
+[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).

### Configuration and Setup

-All configurations used by this feature live under the `spark.dynamicAllocation.*` namespace.
-To enable this feature, your application must set `spark.dynamicAllocation.enabled` to `true`.
-Other relevant configurations are described on the
-[configurations page](configuration.html#dynamic-allocation) and in the subsequent sections in
-detail.
+There are two requirements for using this feature. First, your application must set
+`spark.dynamicAllocation.enabled` to `true`. Second, you must set up an *external shuffle service*
+on each worker node in the same cluster and set `spark.shuffle.service.enabled` to `true` in your
+application. The purpose of the external shuffle service is to allow executors to be removed
+without deleting the shuffle files they have written (described in more detail
+[below](job-scheduling.html#graceful-decommission-of-executors)). The way to set up this service
+varies across cluster managers:
+
+In standalone mode, simply start your workers with `spark.shuffle.service.enabled` set to `true`.
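+
+As a sketch (not part of this patch), one way to pass this flag to the worker JVMs is through
+`SPARK_WORKER_OPTS` in `conf/spark-env.sh` on each worker node:
+
+```sh
+# conf/spark-env.sh on each worker node; the worker runs the shuffle
+# service inside its own JVM when this flag is set
+SPARK_WORKER_OPTS="-Dspark.shuffle.service.enabled=true"
+```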
-Additionally, your application must use an external shuffle service. The purpose of the service is
-to preserve the shuffle files written by executors so the executors can be safely removed (more
-detail described [below](job-scheduling.html#graceful-decommission-of-executors)). To enable
-this service, set `spark.shuffle.service.enabled` to `true`. In YARN, this external shuffle service
-is implemented in `org.apache.spark.yarn.network.YarnShuffleService` that runs in each `NodeManager`
-in your cluster. To start this service, follow these steps:
+In Mesos coarse-grained mode, run `$SPARK_HOME/sbin/start-mesos-shuffle-service.sh` on all
+slave nodes with `spark.shuffle.service.enabled` set to `true`.
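+
+For instance (a sketch, assuming the shuffle service daemon picks up `conf/spark-defaults.conf`
+on each slave node):
+
+```sh
+# on each Mesos slave node
+echo "spark.shuffle.service.enabled true" >> $SPARK_HOME/conf/spark-defaults.conf
+$SPARK_HOME/sbin/start-mesos-shuffle-service.sh
+```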
+
+In YARN mode, start the shuffle service on each `NodeManager` as follows:

1. Build Spark with the [YARN profile](building-spark.html). Skip this step if you are using a
pre-packaged distribution.
@@ -95,10 +90,13 @@ pre-packaged distribution.
2. Add this jar to the classpath of all `NodeManager`s in your cluster.
3. In the `yarn-site.xml` on each node, add `spark_shuffle` to `yarn.nodemanager.aux-services`,
then set `yarn.nodemanager.aux-services.spark_shuffle.class` to
-`org.apache.spark.network.yarn.YarnShuffleService`. Additionally, set all relevant
-`spark.shuffle.service.*` [configurations](configuration.html).
+`org.apache.spark.network.yarn.YarnShuffleService` and `spark.shuffle.service.enabled` to `true`
+(see the sketch after this list).
4. Restart all `NodeManager`s in your cluster.
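+
+A minimal `yarn-site.xml` fragment for step 3 might look like the following sketch; the
+`mapreduce_shuffle` entry is only an assumption about what your cluster already lists, so keep
+whatever aux-services are already configured:
+
+```xml
+<!-- register Spark's shuffle service as a NodeManager aux-service -->
+<property>
+  <name>yarn.nodemanager.aux-services</name>
+  <value>mapreduce_shuffle,spark_shuffle</value>
+</property>
+<property>
+  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
+  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
+</property>
+<!-- per step 3, the Spark-side flag can also be set here for the service -->
+<property>
+  <name>spark.shuffle.service.enabled</name>
+  <value>true</value>
+</property>
+```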
+All other relevant configurations are optional and under the `spark.dynamicAllocation.*` and
+`spark.shuffle.service.*` namespaces. For more detail, see the
+[configurations page](configuration.html#dynamic-allocation).
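+
+As a concrete illustration (a sketch, not from this patch; `spark.dynamicAllocation.minExecutors`
+and `spark.dynamicAllocation.maxExecutors` are optional bounds documented on that page, and the
+application jar and class are placeholders), everything can be passed on the command line:
+
+```sh
+./bin/spark-submit \
+  --conf spark.dynamicAllocation.enabled=true \
+  --conf spark.shuffle.service.enabled=true \
+  --conf spark.dynamicAllocation.minExecutors=1 \
+  --conf spark.dynamicAllocation.maxExecutors=20 \
+  --class org.example.MyApp myapp.jar
+```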
+
### Resource Allocation Policy

At a high level, Spark should relinquish executors when they are no longer used and acquire