Spark: submit to YARN as another user

Is it possible to submit a Spark job to a YARN cluster and choose, either on the command line or inside the jar, which user will "own" the job?

spark-submit will be launched from a script that contains the user name.

PS: Is this still possible if the cluster is Kerberized (and the script has a keytab)?

3 answers

For a non-Kerberized cluster: export HADOOP_USER_NAME=zorro before submitting the Spark job will do the trick.
Then make sure to unset HADOOP_USER_NAME if you want to revert to your default credentials in the rest of the shell script (or in your interactive shell session).
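
If you would rather not export and unset by hand, one option is to scope the variable to a subshell. This is only a minimal sketch, assuming a POSIX-style shell; run_as_hadoop_user is a made-up helper name, not something Hadoop or Spark provides.

```shell
# Hypothetical helper: run one command with HADOOP_USER_NAME set for that command only.
# The function body is a subshell ( ... ), so the export never reaches the caller.
run_as_hadoop_user() (
  export HADOOP_USER_NAME="$1"   # user that should own the job
  shift
  "$@"                           # the command to run, e.g. spark-submit and its arguments
)
```

Called as run_as_hadoop_user zorro spark-submit ..., it leaves HADOOP_USER_NAME in the calling shell exactly as it was, so there is nothing to unset afterwards.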

For a Kerberized cluster, the clean way to impersonate another account without trashing your other jobs/sessions (which probably depend on your default ticket) would be something along these lines...

export KRB5CCNAME=FILE:/tmp/krb5cc_$(id -u)_temp_$$
kinit -kt ~/.protectedDir/zorro.keytab zorro@MY.REALM
spark-submit ...........
kdestroy
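
The same steps can be wrapped so that the temporary ticket cache is destroyed even when the job fails. This is a rough sketch, assuming bash and the standard kinit/kdestroy tools; with_private_ticket_cache is a hypothetical name, and the keytab path and principal are whatever you pass in.

```shell
# Hypothetical helper: run a command with a throwaway Kerberos ticket cache.
# Usage: with_private_ticket_cache <keytab> <principal> <command> [args...]
# The body is a subshell, so KRB5CCNAME in the calling shell is not changed.
with_private_ticket_cache() (
  keytab="$1"
  principal="$2"
  shift 2
  # Private cache file, so the caller's default ticket is left alone.
  export KRB5CCNAME="FILE:/tmp/krb5cc_$(id -u)_temp_$$"
  # Destroy the temporary cache when the subshell exits, whatever the outcome.
  trap 'kdestroy' EXIT
  # Get a ticket from the keytab; give up if that fails.
  kinit -kt "$keytab" "$principal" || exit 1
  "$@"
)
```

For example: with_private_ticket_cache ~/.protectedDir/zorro.keytab zorro@MY.REALM spark-submit ...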

If the user exists on the machine, you can also launch spark-submit with su $my_user -c "spark-submit [...]"

I'm not sure about the Kerberos keytab, but if you run kinit as this user, everything should be fine.



For a non-Kerberized cluster, you can add a Spark conf such as:

--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=<user_name>

Source: https://habr.com/ru/post/1659165/

