Commit 5dff733

Merge pull request apache#122 from palantir/branch-2.2.0-palantir4-k8s-release: Resync with k8s

2 parents: 9a6987e + 542c043

3 files changed: 22 additions & 4 deletions

docs/running-on-kubernetes.md

Lines changed: 18 additions & 3 deletions
```diff
@@ -3,9 +3,15 @@ layout: global
 title: Running Spark on Kubernetes
 ---
 
-Support for running on [Kubernetes](https://kubernetes.io/) is available in experimental status. The feature set is
+Support for running on [Kubernetes](https://kubernetes.io/docs/whatisk8s/) is available in experimental status. The feature set is
 currently limited and not well-tested. This should not be used in production environments.
 
+## Prerequisites
+
+* You must have a running Kubernetes cluster with access configured to it using [kubectl](https://kubernetes.io/docs/user-guide/prereqs/). If you do not already have a working Kubernetes cluster, you may setup a test cluster on your local machine using [minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+* You must have appropriate permissions to create and list [pods](https://kubernetes.io/docs/user-guide/pods/), [nodes](https://kubernetes.io/docs/admin/node/) and [services](https://kubernetes.io/docs/user-guide/services/) in your cluster. You can verify that you can list these resources by running `kubectl get nodes`, `kubectl get pods` and `kubectl get svc` which should give you a list of nodes, pods and services (if any) respectively.
+* You must have an extracted spark distribution with Kubernetes support, or build one from [source](https://github.com/apache-spark-on-k8s/spark).
+
 ## Setting Up Docker Images
 
 Kubernetes requires users to supply images that can be deployed into containers within pods. The images are built to
```
```diff
@@ -49,14 +55,23 @@ being contacted at `api_server_url`. If no HTTP protocol is specified in the URL
 setting the master to `k8s://example.com:443` is equivalent to setting it to `k8s://https://example.com:443`, but to
 connect without SSL on a different port, the master would be set to `k8s://http://example.com:8443`.
 
+
+If you have a Kubernetes cluster setup, one way to discover the apiserver URL is by executing `kubectl cluster-info`.
+
+    > kubectl cluster-info
+    Kubernetes master is running at http://127.0.0.1:8080
+
+In the above example, the specific Kubernetes cluster can be used with spark submit by specifying
+`--master k8s://http://127.0.0.1:8080` as an argument to spark-submit.
+
 Note that applications can currently only be executed in cluster mode, where the driver and its executors are running on
 the cluster.
 
```
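The `k8s://` URL handling described in the hunk above (no explicit protocol after the prefix defaults to HTTPS) can be sketched as a small helper. This is an editorial illustration only; `K8sMasterUrl` and `resolveApiServerUrl` are hypothetical names, not Spark's actual parsing code.

```scala
// Hypothetical sketch of the master URL rule described above:
// `k8s://host:port` is treated as `k8s://https://host:port`, while an
// explicit `http://` or `https://` after the prefix is kept as-is.
object K8sMasterUrl {
  def resolveApiServerUrl(master: String): String = {
    require(master.startsWith("k8s://"), s"Not a Kubernetes master URL: $master")
    val rest = master.stripPrefix("k8s://")
    if (rest.startsWith("http://") || rest.startsWith("https://")) rest
    else s"https://$rest"  // default to SSL when no protocol is given
  }
}
```

Under this rule, `k8s://example.com:443` and `k8s://https://example.com:443` resolve to the same apiserver URL, matching the doc text.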
```diff
 ### Dependency Management and Docker Containers
 
 Spark supports specifying JAR paths that are either on the submitting host's disk, or are located on the disk of the
 driver and executors. Refer to the [application submission](submitting-applications.html#advanced-dependency-management)
-section for details. Note that files specified with the `local` scheme should be added to the container image of both
+section for details. Note that files specified with the `local://` scheme should be added to the container image of both
 the driver and the executors. Files without a scheme or with the scheme `file://` are treated as being on the disk of
 the submitting machine, and are uploaded to the driver running in Kubernetes before launching the application.
```

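The scheme-based dependency rule in the hunk above can be sketched as a small classifier. This is an assumption-laden illustration, not Spark's implementation; `classifyJar` and the `JarLocation` types are invented for this sketch.

```scala
// Sketch of the rule described above: `local://` paths are assumed to
// already exist inside the container image, while bare paths and
// `file://` paths live on the submitting machine and must be uploaded.
// Illustrative names only; not Spark's actual dependency resolution code.
import java.net.URI

sealed trait JarLocation
case object InContainerImage extends JarLocation
case object OnSubmittingMachine extends JarLocation

def classifyJar(path: String): JarLocation =
  Option(new URI(path).getScheme) match {
    case Some("local")       => InContainerImage      // baked into the image
    case Some("file") | None => OnSubmittingMachine   // uploaded to the driver
    case Some(other)         => sys.error(s"Unsupported scheme: $other")
  }
```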
```diff
@@ -81,7 +96,7 @@ the driver container as a [secret volume](https://kubernetes.io/docs/user-guide/
 ### Kubernetes Clusters and the authenticated proxy endpoint
 
 Spark-submit also supports submission through the
-[local kubectl proxy](https://kubernetes.io/docs/user-guide/connecting-to-applications-proxy/). One can use the
+[local kubectl proxy](https://kubernetes.io/docs/user-guide/accessing-the-cluster/#using-kubectl-proxy). One can use the
 authenticating proxy to communicate with the api server directly without passing credentials to spark-submit.
 
 The local proxy can be started by running:
```

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/kubernetes/Client.scala

Lines changed: 1 addition & 0 deletions
```diff
@@ -201,6 +201,7 @@ private[spark] class Client(
     } catch {
       case e: Throwable =>
         driverServiceManager.handleSubmissionError(e)
+        throw e
     } finally {
       Utils.tryLogNonFatalError {
         kubernetesResourceCleaner.deleteAllRegisteredResourcesFromKubernetes(kubernetesClient)
```
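The one-line Client.scala change above rethrows after the error handler runs, so submission failures propagate to the caller while the `finally` block still performs cleanup on both paths. A minimal standalone sketch of this pattern, with stand-in handler and cleanup bodies (not Spark's real code):

```scala
// Sketch of the handle-then-rethrow pattern from the change above.
// Without `throw e`, the catch block would swallow the failure and the
// caller would see a "successful" submission.
import scala.collection.mutable.ArrayBuffer

val events = ArrayBuffer.empty[String]

def submit(fail: Boolean): Unit = {
  try {
    if (fail) throw new RuntimeException("submission failed")
    events += "submitted"
  } catch {
    case e: Throwable =>
      events += "handled"
      throw e  // rethrow so the caller still observes the failure
  } finally {
    events += "cleaned up"  // runs on both the success and failure paths
  }
}

submit(fail = false)
try submit(fail = true)
catch { case _: RuntimeException => events += "caller saw error" }
```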

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/kubernetes/KubernetesResourceCleaner.scala

Lines changed: 3 additions & 1 deletion
```diff
@@ -39,13 +39,15 @@ private[spark] class KubernetesResourceCleaner extends Logging {
 
   def deleteAllRegisteredResourcesFromKubernetes(kubernetesClient: KubernetesClient): Unit = {
     synchronized {
-      logInfo(s"Deleting ${resources.size} registered Kubernetes resources:")
+      val resourceCount = resources.size
+      logInfo(s"Deleting ${resourceCount} registered Kubernetes resources...")
       resources.values.foreach { resource =>
         Utils.tryLogNonFatalError {
           kubernetesClient.resource(resource).delete()
         }
       }
       resources.clear()
+      logInfo(s"Deleted ${resourceCount} registered Kubernetes resources.")
     }
   }
 }
```
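The KubernetesResourceCleaner change above captures `resources.size` in `resourceCount` before mutating the map: after `clear()`, `resources.size` would be 0, so the completion log would misreport the count. A tiny sketch of why the capture matters, with illustrative data rather than real Kubernetes resources:

```scala
// Sketch of the capture-before-clear rationale behind the change above.
// Reading the size after clear() would always log "Deleted 0 ...".
import scala.collection.mutable

val resources = mutable.Map("pod-1" -> "Pod", "svc-1" -> "Service")

val resourceCount = resources.size  // capture before mutation
resources.clear()
val message = s"Deleted ${resourceCount} registered Kubernetes resources."
```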
