Closed

Commits
33 commits
73f2777
initial Driver logic for Hadoop and Kerberos Support
ifilonenko Jun 29, 2018
6069be5
add executors... still need to refactor to use sparkConf exclusivley
ifilonenko Jul 1, 2018
000120f
refactored executor logic preparing for e2e testing
ifilonenko Jul 2, 2018
13b3adc
resolved initial comments
ifilonenko Jul 7, 2018
0939738
merge conflicts
ifilonenko Jul 30, 2018
347536e
Merge branch 'spark-master' into secure-hdfs
ifilonenko Aug 7, 2018
c30ad8c
launching driver with kerberos authentication instead of simple
ifilonenko Aug 7, 2018
1697e74
merge conflicts and addition of security context
ifilonenko Aug 20, 2018
4a000d2
fix dockerfile
ifilonenko Aug 20, 2018
719b059
non-effective attempt to solve null UnixUsername error
ifilonenko Aug 29, 2018
fb9e810
move credential get
ifilonenko Aug 29, 2018
e7935f8
current working solution
ifilonenko Sep 4, 2018
aa3779c
merge conflicts
ifilonenko Sep 4, 2018
32c408c
merge conflicts
ifilonenko Sep 7, 2018
3cf644e
Merge branch 'spark-master' into secure-hdfs
ifilonenko Sep 13, 2018
583a52c
merge conflicts and various additions
ifilonenko Sep 21, 2018
6ae3def
Merge branch 'spark-master' into secure-hdfs
ifilonenko Sep 21, 2018
78953e6
fixes so tests pass
ifilonenko Sep 21, 2018
73f157f
refactor to handle login logic being used in spark-submit
ifilonenko Sep 26, 2018
367e65b
Merge branch 'spark-master' into secure-hdfs
ifilonenko Sep 27, 2018
5f52a1a
resolve comments and add documentation
ifilonenko Sep 27, 2018
6548ef9
resolved comments
ifilonenko Oct 6, 2018
7f72af5
resolved rest of comments
ifilonenko Oct 6, 2018
4ce00a5
small doc addition
ifilonenko Oct 6, 2018
89063fd
fixes to pass kerberos tests
ifilonenko Oct 7, 2018
e303048
resolve comments
ifilonenko Oct 8, 2018
69840a8
resolve comments
ifilonenko Oct 9, 2018
2108154
style and indentation
ifilonenko Oct 9, 2018
a987a70
resolving comments
ifilonenko Oct 9, 2018
e2f8063
hopefully final comment resolution
ifilonenko Oct 9, 2018
f3a0ffb
style issues
ifilonenko Oct 10, 2018
a958920
included new ability to bake krb5.conf into your docker images and no…
ifilonenko Oct 10, 2018
dd95fca
style check
ifilonenko Oct 10, 2018
7 changes: 4 additions & 3 deletions core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -336,7 +336,7 @@ private[spark] class SparkSubmit extends Logging {
val targetDir = Utils.createTempDir()

// assure a keytab is available from any place in a JVM
if (clusterManager == YARN || clusterManager == LOCAL || isMesosClient) {
if (clusterManager == YARN || clusterManager == LOCAL || isMesosClient || isKubernetesCluster) {
@skonto (Contributor), Jul 3, 2018:
This check has been restrictive for customers in the past. There are cases where spark-submit should not have the file locally and the keytab should instead be mounted as a secret within the cluster, for example.

Contributor Author:
This check can be removed, but I included it since I believed that the keytab shouldn't be stored as a secret for security reasons and should instead be only accessible from the JVM.

if (args.principal != null) {
if (args.keytab != null) {
require(new File(args.keytab).exists(), s"Keytab file: ${args.keytab} does not exist")
@@ -644,7 +644,8 @@ private[spark] class SparkSubmit extends Logging {
}
}

if (clusterManager == MESOS && UserGroupInformation.isSecurityEnabled) {
if ((clusterManager == MESOS || clusterManager == KUBERNETES)
&& UserGroupInformation.isSecurityEnabled) {
Contributor:
nit: add more indent

setRMPrincipal(sparkConf)
}

@@ -755,7 +756,7 @@

// [SPARK-20328]. HadoopRDD calls into a Hadoop library that fetches delegation tokens with
// renewer set to the YARN ResourceManager. Since YARN isn't configured in Mesos mode, we
// must trick it into thinking we're YARN.
// must trick it into thinking we're YARN. The same goes for Kubernetes.
Contributor:
Better to make the comment more generic instead of tacking on every resource manager name here...

private def setRMPrincipal(sparkConf: SparkConf): Unit = {
val shortUserName = UserGroupInformation.getCurrentUser.getShortUserName
val key = s"spark.hadoop.${YarnConfiguration.RM_PRINCIPAL}"
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
@@ -211,6 +211,51 @@ private[spark] object Config extends Logging {
"Ensure that major Python version is either Python2 or Python3")
.createWithDefault("2")

val KUBERNETES_KERBEROS_SUPPORT =
ConfigBuilder("spark.kubernetes.kerberos.enabled")
.doc("Specify whether your job is a job that will require a Delegation Token to access HDFS")
@skonto (Contributor), Jul 3, 2018:
I think Kerberos goes beyond DTs, so it shouldn't be specific to that. Also, I don't think you need the user to pass that; you can just call UserGroupInformation.isSecurityEnabled instead of getting that property from config.

.booleanConf
.createWithDefault(false)
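
For reference, the alternative skonto suggests would look roughly like this, assuming Hadoop's standard UserGroupInformation API (where exactly the check would live is not shown; this is a sketch, not the PR's code):

import org.apache.hadoop.security.UserGroupInformation

// Derive the Kerberos requirement from the Hadoop configuration already on the
// classpath instead of introducing a spark.kubernetes.kerberos.enabled flag.
val kerberosEnabled: Boolean = UserGroupInformation.isSecurityEnabled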

val KUBERNETES_KERBEROS_KEYTAB =
ConfigBuilder("spark.kubernetes.kerberos.keytab")
.doc("Specify the location of keytab " +
Contributor:
Fits in one line. Probably similar for other confs you're adding.

"for Kerberos in order to access Secure HDFS")
.stringConf
.createOptional

val KUBERNETES_KERBEROS_PRINCIPAL =
ConfigBuilder("spark.kubernetes.kerberos.principal")
.doc("Specify the principal " +
"for Kerberos in order to access Secure HDFS")
.stringConf
.createOptional

val KUBERNETES_KERBEROS_RENEWER_PRINCIPAL =
ConfigBuilder("spark.kubernetes.kerberos.renewer.principal")
.doc("Specify the principal " +
"you wish to renew and retrieve your Kerberos values with")
.stringConf
.createOptional

val KUBERNETES_KERBEROS_DT_SECRET_NAME =
ConfigBuilder("spark.kubernetes.kerberos.tokensecret.name")
.doc("Specify the name of the secret where " +
"your existing delegation token is stored. This removes the need " +
"for the job user to provide any keytab for launching a job")
.stringConf
.createOptional

val KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY =
ConfigBuilder("spark.kubernetes.kerberos.tokensecret.itemkey")
.doc("Specify the item key of the data where " +
"your existing delegation token is stored. This removes the need " +
"for the job user to provide any keytab for launching a job")
.stringConf
.createOptional



val KUBERNETES_AUTH_SUBMISSION_CONF_PREFIX =
"spark.kubernetes.authenticate.submission"

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala
@@ -65,11 +65,13 @@ private[spark] object Constants {
val ENV_CLASSPATH = "SPARK_CLASSPATH"
val ENV_DRIVER_BIND_ADDRESS = "SPARK_DRIVER_BIND_ADDRESS"
val ENV_SPARK_CONF_DIR = "SPARK_CONF_DIR"
val ENV_SPARK_USER = "SPARK_USER"
@skonto (Contributor), Jul 3, 2018:
I guess this is for setting the correct user. But I think the Hadoop libs should pick up the correct user, as in SparkContext where Utils.getCurrentUserName() is used. In addition, I think we should allow the pods to run with any username (e.g. with customized images) and we should have a security context per container, like SecurityContextBuilder().withRunAsUser(). In the latter scenario the Hadoop libraries will pick up that logged-in user as well. A common scenario would be integrating the container with host PAM (jupyterhub/jupyterhub#535, https://medium.com/@pawitp/syncing-host-and-container-users-in-docker-39337eff0094) or with LDAP, etc.
Btw, if you change the user, the container user also needs to exist on the image when basic authorization (unix groups) is used.
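
For illustration, a per-container security context of the kind skonto mentions could be built with the fabric8 client this module already uses; the container name is taken from the constants above and the UID is hypothetical:

import io.fabric8.kubernetes.api.model.{ContainerBuilder, SecurityContextBuilder}

// Hypothetical UID; in practice it would come from configuration or the image.
val runAsUser = 185L

// Run the container as that user; Hadoop libraries would then pick up the
// logged-in user rather than relying on a SPARK_USER environment variable.
val container = new ContainerBuilder()
  .withName("spark-kubernetes-driver")
  .withSecurityContext(
    new SecurityContextBuilder()
      .withRunAsUser(runAsUser)
      .build())
  .build()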

// Spark app configs for containers
val SPARK_CONF_VOLUME = "spark-conf-volume"
val SPARK_CONF_DIR_INTERNAL = "/opt/spark/conf"
val SPARK_CONF_FILE_NAME = "spark.properties"
val SPARK_CONF_PATH = s"$SPARK_CONF_DIR_INTERNAL/$SPARK_CONF_FILE_NAME"
val ENV_HADOOP_TOKEN_FILE_LOCATION = "HADOOP_TOKEN_FILE_LOCATION"

// BINDINGS
val ENV_PYSPARK_PRIMARY = "PYSPARK_PRIMARY"
@@ -81,4 +83,35 @@ private[spark] object Constants {
val KUBERNETES_MASTER_INTERNAL_URL = "https://kubernetes.default.svc"
val DRIVER_CONTAINER_NAME = "spark-kubernetes-driver"
val MEMORY_OVERHEAD_MIN_MIB = 384L

// Hadoop Configuration
val HADOOP_FILE_VOLUME = "hadoop-properties"
val HADOOP_CONF_DIR_PATH = "/etc/hadoop/conf"
val ENV_HADOOP_CONF_DIR = "HADOOP_CONF_DIR"
val HADOOP_CONF_DIR_LOC = "spark.kubernetes.hadoop.conf.dir"
val HADOOP_CONFIG_MAP_SPARK_CONF_NAME =
Contributor:
The name of this constant is very confusing. First of all, can this be renamed to HADOOP_CONFIG_MAP_NAME? Second, looks like spark.kubernetes.executor.hadoopConfigMapName is sufficient.

"spark.kubernetes.hadoop.executor.hadoopConfigMapName"

// Kerberos Configuration
val KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME =
"spark.kubernetes.kerberos.delegation-token-secret-name"
val KERBEROS_KEYTAB_SECRET_NAME =
Contributor:
Looks like this is used as the name of the secret storing the DT, not the keytab. So please rename it.

"spark.kubernetes.kerberos.key-tab-secret-name"
val KERBEROS_KEYTAB_SECRET_KEY =
"spark.kubernetes.kerberos.key-tab-secret-key"
val KERBEROS_SPARK_USER_NAME =
"spark.kubernetes.kerberos.spark-user-name"
val KERBEROS_SECRET_LABEL_PREFIX =
"hadoop-tokens"
val SPARK_HADOOP_PREFIX = "spark.hadoop."
val HADOOP_SECURITY_AUTHENTICATION =
SPARK_HADOOP_PREFIX + "hadoop.security.authentication"

// Kerberos Token-Refresh Server
val KERBEROS_REFRESH_LABEL_KEY = "refresh-hadoop-tokens"
@skonto (Contributor), Jul 3, 2018:
I left a comment in the design doc. Can we also provide the option of using an existing renewal service, for example when integrating with an external Hadoop cluster where people already have one? AFAIK the Hadoop libs do the renewal behind the scenes by talking to the appropriate service.
This is the current implementation with Mesos for integrating with existing clusters.

Contributor Author:
Because our original architecture assumed the renewal service pod would exist as a separate micro-service, that option could be handled by that renewal service. We used this label to detect that this specific secret was to be renewed. But if we wished to use another, existing renewal service, we might be able to just grab an Array[Byte] from some DTManager in their external Hadoop cluster and store it in a secret. Thank you for this note in the design doc.

Contributor:
It would be good to have that option.

Contributor Author:
Agreed, but that would be out of the scope of this PR, as the renewal service is a separate micro-service (instead of a running thread); that logic would therefore be housed in a separate PR governing the renewal service pods' "DT retrieving" protocol.

Contributor:
Is this label actually being used for anything?

val KERBEROS_REFRESH_LABEL_VALUE = "yes"

// Hadoop credentials secrets for the Spark app.
val SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR = "/mnt/secrets/hadoop-credentials"
val SPARK_APP_HADOOP_SECRET_VOLUME_NAME = "hadoop-secret"
}
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
@@ -23,6 +23,8 @@ import io.fabric8.kubernetes.api.model.{LocalObjectReference, LocalObjectReferen
import org.apache.spark.SparkConf
import org.apache.spark.deploy.k8s.Config._
import org.apache.spark.deploy.k8s.Constants._
import org.apache.spark.deploy.k8s.features.hadoopsteps.HadoopStepsOrchestrator
import org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
import org.apache.spark.deploy.k8s.submit._
import org.apache.spark.internal.config.ConfigEntry

@@ -59,7 +61,20 @@ private[spark] case class KubernetesConf[T <: KubernetesRoleSpecificConf](
roleSecretNamesToMountPaths: Map[String, String],
roleSecretEnvNamesToKeyRefs: Map[String, String],
roleEnvs: Map[String, String],
sparkFiles: Seq[String]) {
sparkFiles: Seq[String],
hadoopConfDir: Option[String]) {

def getHadoopConfigMapName: String = s"$appResourceNamePrefix-hadoop-config"
Contributor:
Please avoid get in the property names (see existing methods in this class).


def getHadoopStepsOrchestrator : Option[HadoopStepsOrchestrator] = hadoopConfDir.map {
hConf => new HadoopStepsOrchestrator(
Contributor:
nit: hConf => stays together with the {; if it doesn't fit, use a full method declaration with enclosing braces.

sparkConf,
appResourceNamePrefix,
hConf,
getHadoopConfigMapName)}
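
For clarity, the layout the formatting nit above asks for would look roughly like this (same logic as the method in the diff, reformatted only; all names come from the diff):

def getHadoopStepsOrchestrator: Option[HadoopStepsOrchestrator] = hadoopConfDir.map { hConf =>
  new HadoopStepsOrchestrator(
    sparkConf,
    appResourceNamePrefix,
    hConf,
    getHadoopConfigMapName)
}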

def getTokenManager : KubernetesHadoopDelegationTokenManager =
Contributor:
Strange to have the KubernetesConf object return a unit that does work - most of these are properties. This thing should basically behave like a struct.

Contributor Author:
The KubernetesHadoopDelegationTokenManager is crucial for token retrieval, which needs to happen in the feature steps. As such, it must be retrieved via the KubernetesConf. Sadly, I don't see another way given the design of the feature steps.

new KubernetesHadoopDelegationTokenManager

def namespace(): String = sparkConf.get(KUBERNETES_NAMESPACE)

@@ -111,7 +126,8 @@ private[spark] object KubernetesConf {
mainAppResource: Option[MainAppResource],
mainClass: String,
appArgs: Array[String],
maybePyFiles: Option[String]): KubernetesConf[KubernetesDriverSpecificConf] = {
maybePyFiles: Option[String],
hadoopConfDir: Option[String]): KubernetesConf[KubernetesDriverSpecificConf] = {
val sparkConfWithMainAppJar = sparkConf.clone()
val additionalFiles = mutable.ArrayBuffer.empty[String]
mainAppResource.foreach {
@@ -171,7 +187,8 @@ private[spark] object KubernetesConf {
driverSecretNamesToMountPaths,
driverSecretEnvNamesToKeyRefs,
driverEnvs,
sparkFiles)
sparkFiles,
hadoopConfDir)
}

def createExecutorConf(
@@ -214,6 +231,7 @@ private[spark] object KubernetesConf {
executorMountSecrets,
executorEnvSecrets,
executorEnv,
Seq.empty[String])
Seq.empty[String],
None)
}
}
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
@@ -0,0 +1,52 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.spark.deploy.k8s.features
Contributor:
Why is this not under the hadoopsteps package?


import io.fabric8.kubernetes.api.model.HasMetadata

import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesRoleSpecificConf, SparkPod}
import org.apache.spark.deploy.k8s.Constants._
import org.apache.spark.deploy.k8s.features.hadoopsteps.HadoopBootstrapUtil
import org.apache.spark.internal.Logging

/**
* This step is responsible for bootstrapping the container with ConfigMaps
* containing Hadoop config files mounted as volumes and an ENV variable
* pointed to the mounted file directory. This is run by both the driver
* and executor, as they both require Hadoop config files.
*/
private[spark] class HadoopConfExecutorFeatureStep(
kubernetesConf: KubernetesConf[_ <: KubernetesRoleSpecificConf])
extends KubernetesFeatureConfigStep with Logging {

override def configurePod(pod: SparkPod): SparkPod = {
val maybeHadoopConfDir = kubernetesConf.sparkConf.getOption(HADOOP_CONF_DIR_LOC)
val maybeHadoopConfigMap = kubernetesConf.sparkConf.getOption(HADOOP_CONFIG_MAP_SPARK_CONF_NAME)
require(maybeHadoopConfDir.isDefined && maybeHadoopConfigMap.isDefined,
"Ensure that HADOOP_CONF_DIR is defined")
logInfo("HADOOP_CONF_DIR defined. Mounting Hadoop specific files")
HadoopBootstrapUtil.bootstrapHadoopConfDir(
maybeHadoopConfDir.get,
maybeHadoopConfigMap.get,
kubernetesConf.getTokenManager,
pod)
}

override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
Contributor:
No need to address it here but it feels like these methods should have default implementations, given that lots of classes just don't do anything with them.


override def getAdditionalKubernetesResources(): Seq[HasMetadata] = Seq.empty
}
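
As a sketch of the default implementations suggested in the comment above (method names and signatures are taken from this file; making them defaults on the trait is the hypothetical part):

import io.fabric8.kubernetes.api.model.HasMetadata

import org.apache.spark.deploy.k8s.SparkPod

// Hypothetical variant of KubernetesFeatureConfigStep with no-op defaults, so
// feature steps that only modify the pod need not override the other methods.
trait KubernetesFeatureConfigStep {
  def configurePod(pod: SparkPod): SparkPod

  def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty

  def getAdditionalKubernetesResources(): Seq[HasMetadata] = Seq.empty
}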
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopGlobalFeatureDriverStep.scala
@@ -0,0 +1,123 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.spark.deploy.k8s.features

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, ContainerBuilder, HasMetadata, PodBuilder}

import org.apache.spark.deploy.k8s.{KubernetesConf, SparkPod}
import org.apache.spark.deploy.k8s.Constants._
import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
import org.apache.spark.deploy.k8s.features.hadoopsteps.{HadoopBootstrapUtil, HadoopConfigSpec, HadoopConfigurationStep}
import org.apache.spark.internal.Logging

/**
* This is the main method that runs the hadoopConfigurationSteps defined
Contributor:
All this seems to just say "Runs the configuration steps defined by HadoopStepsOrchestrator" which is a lot shorter.

* by the HadoopStepsOrchestrator. These steps are run to modify the
* SparkPod and Kubernetes Resources using the additive method of the feature steps
*/
private[spark] class HadoopGlobalFeatureDriverStep(
Contributor:
Can we just call this KerberosConfFeatureDriverStep to be consistent with the one for the executor?

kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
Contributor:
nit: params in their own line are double-indented. Also happens in other places.

extends KubernetesFeatureConfigStep with Logging {
private val hadoopTestOrchestrator =
kubernetesConf.getHadoopStepsOrchestrator
Contributor:
fits in previous line.

require(kubernetesConf.hadoopConfDir.isDefined &&
hadoopTestOrchestrator.isDefined, "Ensure that HADOOP_CONF_DIR is defined")
Contributor:
keep condition in previous line, break only the message if it doesn't fit.

private val hadoopSteps =
hadoopTestOrchestrator
.map(hto => hto.getHadoopSteps(kubernetesConf.getTokenManager))
.getOrElse(Seq.empty[HadoopConfigurationStep])

var currentHadoopSpec = HadoopConfigSpec(
podVolumes = Seq.empty,
containerEnvs = Seq.empty,
containerVMs = Seq.empty,
configMapProperties = Map.empty[String, String],
dtSecret = None,
dtSecretName = KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME,
dtSecretItemKey = None,
jobUserName = None)

for (nextStep <- hadoopSteps) {
currentHadoopSpec = nextStep.configureHadoopSpec(currentHadoopSpec)
}

override def configurePod(pod: SparkPod): SparkPod = {
val hadoopBasedPod = new PodBuilder(pod.pod)
.editSpec()
.addAllToVolumes(currentHadoopSpec.podVolumes.asJava)
.endSpec()
.build()

val hadoopBasedContainer = new ContainerBuilder(pod.container)
.addAllToEnv(currentHadoopSpec.containerEnvs.asJava)
.addAllToVolumeMounts(currentHadoopSpec.containerVMs.asJava)
.build()

val hadoopBasedSparkPod = HadoopBootstrapUtil.bootstrapHadoopConfDir(
kubernetesConf.hadoopConfDir.get,
kubernetesConf.getHadoopConfigMapName,
kubernetesConf.getTokenManager,
SparkPod(hadoopBasedPod, hadoopBasedContainer))

val maybeKerberosModification =
for {
secretItemKey <- currentHadoopSpec.dtSecretItemKey
userName <- currentHadoopSpec.jobUserName
} yield {
HadoopBootstrapUtil.bootstrapKerberosPod(
currentHadoopSpec.dtSecretName,
secretItemKey,
userName,
hadoopBasedSparkPod)
}
maybeKerberosModification.getOrElse(
HadoopBootstrapUtil.bootstrapSparkUserPod(
kubernetesConf.getTokenManager.getCurrentUser.getShortUserName,
hadoopBasedSparkPod))
}

override def getAdditionalPodSystemProperties(): Map[String, String] = {
val maybeKerberosConfValues =
for {
secretItemKey <- currentHadoopSpec.dtSecretItemKey
userName <- currentHadoopSpec.jobUserName
} yield {
Map(KERBEROS_KEYTAB_SECRET_NAME -> currentHadoopSpec.dtSecretName,
KERBEROS_KEYTAB_SECRET_KEY -> secretItemKey,
KERBEROS_SPARK_USER_NAME -> userName)
}
Contributor:
nit: more indent

val resolvedConfValues = maybeKerberosConfValues.getOrElse(
Map(KERBEROS_SPARK_USER_NAME ->
kubernetesConf.getTokenManager.getCurrentUser.getShortUserName)
)
Map(HADOOP_CONFIG_MAP_SPARK_CONF_NAME -> kubernetesConf.getHadoopConfigMapName,
HADOOP_CONF_DIR_LOC -> kubernetesConf.hadoopConfDir.get) ++ resolvedConfValues
}

override def getAdditionalKubernetesResources(): Seq[HasMetadata] = {
val configMap =
new ConfigMapBuilder()
.withNewMetadata()
.withName(kubernetesConf.getHadoopConfigMapName)
.endMetadata()
.addToData(currentHadoopSpec.configMapProperties.asJava)
.build()
Seq(configMap) ++ currentHadoopSpec.dtSecret.toSeq
}
}