I am trying to use YARN node labels to tag worker nodes, but when I run applications on YARN (Spark or a plain YARN application), these applications cannot start.
- with Spark, when specifying --conf spark.yarn.am.nodeLabelExpression="my-label", the job cannot start (stuck at Submitted application [...], see details below).
- with a YARN application (for example distributedshell), when specifying -node_label_expression my-label, the application cannot start either.
Here are the tests that I have done so far.
YARN node labels
I use Google Dataproc to run my cluster (example: 4 workers, 2 of them on preemptible nodes). My goal is to force any YARN application master to run on a non-preemptible node, since a preemptible node can be shut down at any time, which makes the application fail.
I create the cluster using YARN properties (--properties) to enable node labels:
gcloud dataproc clusters create \
my-dataproc-cluster \
--project [PROJECT_ID] \
--zone [ZONE] \
--master-machine-type n1-standard-1 \
--master-boot-disk-size 10 \
--num-workers 2 \
--worker-machine-type n1-standard-1 \
--worker-boot-disk-size 10 \
--num-preemptible-workers 2 \
--properties 'yarn:yarn.node-labels.enabled=true,yarn:yarn.node-labels.fs-store.root-dir=/system/yarn/node-labels'
Versions of packaged Hadoop and Spark:
- Hadoop Version: 2.8.2
- Spark: 2.2.0
After that, I create a label (my-label) and assign it to the two non-preemptible workers:
yarn rmadmin -addToClusterNodeLabels "my-label(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "\
[WORKER_0_NAME].c.[PROJECT_ID].internal=my-label \
[WORKER_1_NAME].c.[PROJECT_ID].internal=my-label"
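As a sanity check, the label assignment can also be verified from the command line (these are standard YARN CLI commands; [WORKER_0_NAME] is the same placeholder as above):

yarn cluster --list-node-labels    # should list my-label(exclusive=false)
yarn node -list -all               # lists node IDs; yarn node -status <NodeId> then shows Node-Labels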
I can see the created label in the YARN web interface:

Spark
When I run a simple example (SparkPi) without specifying any node label information:
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
/usr/lib/spark/examples/jars/spark-examples.jar \
10
the Scheduler tab of the YARN web interface shows that the application is launched on <DEFAULT_PARTITION>.root.default.
But when I run the job with spark.yarn.am.nodeLabelExpression set, to pin the location of the Spark application master:
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
--conf spark.yarn.am.nodeLabelExpression="my-label" \
/usr/lib/spark/examples/jars/spark-examples.jar \
10
the application cannot start. From the YARN web interface:
- YarnApplicationState:
ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
- Diagnostics:
Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = my-label ; Partition Resource = <memory:6144, vCores:2> ; Queue Absolute capacity = 0.0 % ; Queue Absolute used capacity = 0.0 % ; Queue Absolute max capacity = 0.0 % ;
I suspect that the queue related to the label partition (not <DEFAULT_PARTITION>, the other one) does not have sufficient resources to run the job:

Here, Used Application Master Resources is <memory:1024, vCores:1>, but Max Application Master Resources is <memory:0, vCores:0>. That explains why the application cannot start, but I cannot figure out how to change this.
I tried to update different parameters, but without success:
yarn.scheduler.capacity.root.default.accessible-node-labels=my-label
Or increasing these properties:
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.capacity
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.maximum-capacity
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.maximum-am-resource-percent
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.user-limit-factor
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.minimum-user-limit-percent
Without success either.
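Concretely, this is the kind of change I was testing in capacity-scheduler.xml (shown here as key=value pairs; the 100 % values are only an example, and declaring the capacity on the root queue as well is an assumption on my part, since the label partition capacity seems to be inherited from root), refreshing the queues after each attempt:

yarn.scheduler.capacity.root.accessible-node-labels.my-label.capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels.my-label.maximum-capacity=100

yarn rmadmin -refreshQueues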
YARN application
The problem is the same when running a YARN application directly (here, the distributedshell example):
hadoop jar \
/usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
-shell_command "echo ok" \
-jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
-queue default \
-node_label_expression my-label
The application cannot start, and the log keeps repeating:
INFO distributedshell.Client: Got application report from ASM for, appId=6, clientToAMToken=null, appDiagnostics= Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = my-label ; Partition Resource = <memory:6144, vCores:2> ; Queue Absolute capacity = 0.0 % ; Queue Absolute used capacity = 0.0 % ; Queue Absolute max capacity = 0.0 % ; , appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1520354045946, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, [...]
If I do not specify -node_label_expression my-label, the application starts on <DEFAULT_PARTITION>.root.default and completes successfully.
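To cross-check the queue side of the problem, the default queue's label settings can also be dumped from the CLI (a standard command, used here purely as a diagnostic):

yarn queue -status default

Its output should include the queue's "Accessible Node Labels" and "Default Node Label expression" fields, which would show whether root.default is actually allowed to use my-label.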
My questions:
- Did I miss something in the node label configuration?
- Is this a known issue with Dataproc? The documentation does not mention node labels, but I assume they are supported.
- Am I missing some configuration on my cluster? The problem does not seem "specific" to Spark, since the plain YARN application fails in the same way.