docs/running-on-kubernetes.md (+18 −3)
@@ -3,9 +3,15 @@ layout: global
title: Running Spark on Kubernetes
---

Support for running on [Kubernetes](https://kubernetes.io/docs/whatisk8s/) is available in experimental status. The feature set is currently limited and not well-tested. This should not be used in production environments.

## Prerequisites

* You must have a running Kubernetes cluster with access to it configured using [kubectl](https://kubernetes.io/docs/user-guide/prereqs/). If you do not already have a working Kubernetes cluster, you may set up a test cluster on your local machine using [minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
* You must have appropriate permissions to create and list [pods](https://kubernetes.io/docs/user-guide/pods/), [nodes](https://kubernetes.io/docs/admin/node/), and [services](https://kubernetes.io/docs/user-guide/services/) in your cluster. You can verify that you can list these resources by running `kubectl get nodes`, `kubectl get pods`, and `kubectl get svc`, which should give you a list of nodes, pods, and services (if any), respectively; see the commands after this list.
* You must have an extracted Spark distribution with Kubernetes support, or build one from [source](https://github.com/apache-spark-on-k8s/spark).
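
For example, the permission check described above can be run as follows (each command should print a table of the matching resources, or an empty result if none exist yet):

    kubectl get nodes
    kubectl get pods
    kubectl get svc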

## Setting Up Docker Images

Kubernetes requires users to supply images that can be deployed into containers within pods. The images are built to …
@@ -49,14 +55,23 @@ being contacted at `api_server_url`. If no HTTP protocol is specified in the URL
setting the master to `k8s://example.com:443` is equivalent to setting it to `k8s://https://example.com:443`, but to connect without SSL on a different port, the master would be set to `k8s://http://example.com:8443`.
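
For instance, with a placeholder hostname, the defaulting rule above plays out as follows:

    # HTTPS is assumed when no protocol is given; these two are equivalent:
    spark-submit --master k8s://example.com:443 ...
    spark-submit --master k8s://https://example.com:443 ...
    # Plain HTTP on a different port must be spelled out explicitly:
    spark-submit --master k8s://http://example.com:8443 ...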

If you have a Kubernetes cluster set up, one way to discover the apiserver URL is by executing `kubectl cluster-info`:

    > kubectl cluster-info
    Kubernetes master is running at http://127.0.0.1:8080

In the above example, the specific Kubernetes cluster can be used with spark-submit by specifying `--master k8s://http://127.0.0.1:8080` as an argument to spark-submit.
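
Putting this together, a cluster-mode submission against that apiserver might look like the sketch below; the example class, in-image JAR path, and image configuration keys are illustrative assumptions rather than fixed names from this project:

    # A minimal sketch; substitute your own class, JAR location, and images.
    bin/spark-submit \
      --deploy-mode cluster \
      --master k8s://http://127.0.0.1:8080 \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.kubernetes.driver.docker.image=registry.example.com/spark-driver:latest \
      --conf spark.kubernetes.executor.docker.image=registry.example.com/spark-executor:latest \
      local:///opt/spark/examples/jars/spark-examples.jar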

Note that applications can currently only be executed in cluster mode, where the driver and its executors are running on the cluster.

### Dependency Management and Docker Containers

Spark supports specifying JAR paths that are either on the submitting host's disk, or are located on the disk of the driver and executors. Refer to the [application submission](submitting-applications.html#advanced-dependency-management) section for details. Note that files specified with the `local://` scheme should be added to the container image of both the driver and the executors. Files without a scheme or with the scheme `file://` are treated as being on the disk of the submitting machine, and are uploaded to the driver running in Kubernetes before launching the application.
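
As an illustration of the two schemes (the paths and application class here are hypothetical):

    # my-dep.jar must already be present in the driver and executor images (local://),
    # while the application JAR is uploaded from the submitting machine (file://):
    bin/spark-submit \
      --deploy-mode cluster \
      --master k8s://http://127.0.0.1:8080 \
      --jars local:///opt/spark/extra/my-dep.jar \
      --class com.example.MyApp \
      file:///home/user/my-app.jar
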
@@ -81,7 +96,7 @@ the driver container as a [secret volume](https://kubernetes.io/docs/user-guide/

### Kubernetes Clusters and the authenticated proxy endpoint

Spark-submit also supports submission through the [local kubectl proxy](https://kubernetes.io/docs/user-guide/accessing-the-cluster/#using-kubectl-proxy). One can use the authenticating proxy to communicate with the API server directly without passing credentials to spark-submit.
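
A minimal sketch of that flow, assuming the proxy's default listen address of 127.0.0.1:8001 (remaining spark-submit arguments elided):

    # Start the authenticating local proxy, then submit through it:
    kubectl proxy &
    bin/spark-submit --deploy-mode cluster --master k8s://http://127.0.0.1:8001 ...
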

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/kubernetes/KubernetesResourceCleaner.scala (+3 −1)
@@ -39,13 +39,15 @@ private[spark] class KubernetesResourceCleaner extends Logging {