
Commit 48d7418

Author: Andrew Or
Update dynamic allocation docs

1 parent a9a6b80


docs/job-scheduling.md

Lines changed: 26 additions & 28 deletions
@@ -56,36 +56,31 @@ provide another approach to share RDDs.
 
 ## Dynamic Resource Allocation
 
-Spark 1.2 introduces the ability to dynamically scale the set of cluster resources allocated to
-your application up and down based on the workload. This means that your application may give
-resources back to the cluster if they are no longer used and request them again later when there
-is demand. This feature is particularly useful if multiple applications share resources in your
-Spark cluster. If a subset of the resources allocated to an application becomes idle, it can be
-returned to the cluster's pool of resources and acquired by other applications. In Spark, dynamic
-resource allocation is performed on the granularity of the executor and can be enabled through
-`spark.dynamicAllocation.enabled`.
-
-This feature is currently disabled by default and available only on [YARN](running-on-yarn.html).
-A future release will extend this to [standalone mode](spark-standalone.html) and
-[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes). Note that although Spark on
-Mesos already has a similar notion of dynamic resource sharing in fine-grained mode, enabling
-dynamic allocation allows your Mesos application to take advantage of coarse-grained low-latency
-scheduling while sharing cluster resources efficiently.
+Spark provides a mechanism to dynamically adjust the resources your application occupies based
+on the workload. This means that your application may give resources back to the cluster if they
+are no longer used and request them again later when there is demand. This feature is particularly
+useful if multiple applications share resources in your Spark cluster.
+
+This feature is disabled by default and available on all coarse-grained cluster managers, i.e.
+[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html), and
+[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).
 
 ### Configuration and Setup
 
-All configurations used by this feature live under the `spark.dynamicAllocation.*` namespace.
-To enable this feature, your application must set `spark.dynamicAllocation.enabled` to `true`.
-Other relevant configurations are described on the
-[configurations page](configuration.html#dynamic-allocation) and in the subsequent sections in
-detail.
+There are two requirements for using this feature. First, your application must set
+`spark.dynamicAllocation.enabled` to `true`. Second, you must set up an *external shuffle service*
+on each worker node in the same cluster and set `spark.shuffle.service.enabled` to true in your
+application. The purpose of the external shuffle service is to allow executors to be removed
+without deleting shuffle files written by them (more detail described
+[below](job-scheduling.html#graceful-decommission-of-executors)). The way to set up this service
+varies across cluster managers:
+
+In standalone mode, simply start your workers with `spark.shuffle.service.enabled` set to `true`.
 
-Additionally, your application must use an external shuffle service. The purpose of the service is
-to preserve the shuffle files written by executors so the executors can be safely removed (more
-detail described [below](job-scheduling.html#graceful-decommission-of-executors)). To enable
-this service, set `spark.shuffle.service.enabled` to `true`. In YARN, this external shuffle service
-is implemented in `org.apache.spark.yarn.network.YarnShuffleService` that runs in each `NodeManager`
-in your cluster. To start this service, follow these steps:
+In Mesos coarse-grained mode, run `$SPARK_HOME/sbin/start-mesos-shuffle-service.sh` on all
+slave nodes with `spark.shuffle.service.enabled` set to `true`.
+
+In YARN mode, start the shuffle service on each `NodeManager` as follows:
 
 1. Build Spark with the [YARN profile](building-spark.html). Skip this step if you are using a
 pre-packaged distribution.
@@ -95,10 +90,13 @@ pre-packaged distribution.
 2. Add this jar to the classpath of all `NodeManager`s in your cluster.
 3. In the `yarn-site.xml` on each node, add `spark_shuffle` to `yarn.nodemanager.aux-services`,
 then set `yarn.nodemanager.aux-services.spark_shuffle.class` to
-`org.apache.spark.network.yarn.YarnShuffleService`. Additionally, set all relevant
-`spark.shuffle.service.*` [configurations](configuration.html).
+`org.apache.spark.network.yarn.YarnShuffleService` and `spark.shuffle.service.enabled` to true.
 4. Restart all `NodeManager`s in your cluster.
 
+All other relevant configurations are optional and under the `spark.dynamicAllocation.*` and
+`spark.shuffle.service.*` namespaces. For more detail, see the
+[configurations page](configuration.html#dynamic-allocation).
+
 ### Resource Allocation Policy
 
 At a high level, Spark should relinquish executors when they are no longer used and acquire
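The two requirements in the revised "Configuration and Setup" text map directly onto two `SparkConf` settings. Below is a minimal sketch, assuming the external shuffle service is already running on every worker node; the application name is illustrative, and the keys are the ones named in the diff.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch of the application-side settings for dynamic allocation.
// Assumes each worker node already runs the external shuffle service.
val conf = new SparkConf()
  .setAppName("dynamic-allocation-example") // hypothetical name
  // Requirement 1: allow Spark to grow and shrink the executor set with load.
  .set("spark.dynamicAllocation.enabled", "true")
  // Requirement 2: serve shuffle files through the external service so
  // executors can be removed without losing the shuffle output they wrote.
  .set("spark.shuffle.service.enabled", "true")

val sc = new SparkContext(conf)
```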

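Step 3 of the YARN instructions amounts to registering the shuffle service as a YARN auxiliary service. A sketch of the corresponding `yarn-site.xml` entries follows; the `mapreduce_shuffle` entry is an assumed pre-existing value, so keep whatever aux-services your file already declares and append `spark_shuffle` to that list.

```xml
<!-- yarn-site.xml on each NodeManager (step 3 above). -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- Append spark_shuffle to your existing list; mapreduce_shuffle is
       shown here only as a common pre-existing entry. -->
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>spark.shuffle.service.enabled</name>
  <value>true</value>
</property>
```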