I can't find evidence that my NodeInitializationAction for Dataproc ran

I'm specifying a NodeInitializationAction for Dataproc as follows:

ClusterConfig clusterConfig = new ClusterConfig();
clusterConfig.setGceClusterConfig(...);
clusterConfig.setMasterConfig(...);
clusterConfig.setWorkerConfig(...);
List<NodeInitializationAction> initActions = new ArrayList<>();
NodeInitializationAction action = new NodeInitializationAction();
action.setExecutableFile("gs://mybucket/myExecutableFile");
initActions.add(action);
clusterConfig.setInitializationActions(initActions);

Then later:

Cluster cluster = new Cluster();
cluster.setProjectId("wide-isotope-147019");
cluster.setConfig(clusterConfig);
cluster.setClusterName("cat");

Finally, I invoke the dataproc.create operation with the cluster. I can see the cluster being created, but when I log in to the master machine ("cat-m" in us-central1-f), I see no evidence that the script I specified was copied over or run.

So this leads to my questions:

  • What should I expect in terms of evidence? (edit: I found the script itself in /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0).
  • Where does the script get run from? I know that it runs as the root user, but beyond that I'm not sure where to look. I did not find it in the root directory.
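  • At what point does the operation returned from the Create call change from "CREATING" to "RUNNING"? Does this happen before the script is invoked, and does it matter whether the script exits with a non-zero code?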


Here is how Dataproc handles init actions:

  • The script is downloaded and stored locally at: /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0

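  • The script's output is captured in a "staging bucket" (either the bucket specified with the --bucket option, or one generated automatically by Dataproc). Assuming your cluster is named my-cluster, if you describe the master instance with gcloud compute instances describe my-cluster-m, the exact location is under the dataproc-agent-output-directory metadata key.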

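  • The cluster does not enter the RUNNING state (and the operation does not complete) until the init actions have finished executing. If an init action exits with a non-zero code, or an init action exceeds its timeout, this is reported as a failure.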

  • If you still have doubts :) check the Dataproc agent log at /var/log/google-dataproc-agent-0.log and look for entries from BootstrapActionRunner.
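
The local checks from the list above can be sketched in Java and run on the master node. This is only an illustration: the class name InitActionEvidence and the helper bootstrapEntries are made up for this example, and it assumes the two paths quoted above exist on your image and that the agent log is plain text you have permission to read (you may need to run it as root).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any Dataproc library.
public class InitActionEvidence {
    // Paths as quoted in the answer; adjust if your image differs.
    static final Path SCRIPT = Paths.get(
            "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0");
    static final Path AGENT_LOG = Paths.get("/var/log/google-dataproc-agent-0.log");

    // Keep only the log lines that mention BootstrapActionRunner.
    static List<String> bootstrapEntries(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("BootstrapActionRunner"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // First piece of evidence: the downloaded copy of the init script.
        System.out.println("Script present: " + Files.exists(SCRIPT));

        // Second piece of evidence: agent log entries about the init action.
        if (Files.isReadable(AGENT_LOG)) {
            bootstrapEntries(Files.readAllLines(AGENT_LOG))
                    .forEach(System.out::println);
        } else {
            System.out.println("Agent log not readable: " + AGENT_LOG);
        }
    }
}
```

If the script is present but the filter prints nothing, the staging bucket output described above is the next place to look.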


Source: https://habr.com/ru/post/1664607/

