COMPSs - "All nodes which are allocated for this job are already filled" error

After submitting the COMPSs application, I received the following error message and the application failed.

MPI_CMD=mpirun -timestamp-output -n 1 -H s00r0 /apps/COMPSs/1.3/Runtime/scripts/user/runcompss --project=/tmp/1668183.tmpdir/project_1458303603.xml --resources=/tmp/1668183.tmpdir/resources_1458303603.xml --uuid=2ed20e6a-9f02-49ff-a71c-e071ce35dacc /apps/FILESPACE/pycompssfile arg1 arg2 : -n 1 -H s00r0 /apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh /apps/INTEL/mkl/lib/intel64 null /home/myhome/kmeans_python/src/ true /tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 1 /apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc : -n 1 -H s00r0 /apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh /apps/INTEL/mkl/lib/intel64 null /home/myhome/kmeans_python/src/ true /tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 2 /apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc
--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------

I am using COMPSs 1.3.

Why is this happening?

1 answer

You are trying to run the master and a worker on the same node. On a cluster, COMPSs 1.3 with the NIO adaptor (the default option) uses mpirun to spawn the master and worker processes on the different nodes of the cluster, and the mpirun installed on your cluster does not allow that.
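You can check this directly from the MPI_CMD line in your log. The snippet below is only a minimal sketch: the helper name ranks_per_host is hypothetical, the command is abbreviated from the one in the question, and it assumes each segment names a single host after -H.

```python
from collections import Counter


def ranks_per_host(mpi_cmd: str) -> Counter:
    """Count how many processes an MPMD-style mpirun line requests per host."""
    # Drop the "MPI_CMD=" log prefix if it is present.
    if mpi_cmd.startswith("MPI_CMD="):
        mpi_cmd = mpi_cmd[len("MPI_CMD="):]

    counts = Counter()
    # mpirun separates the per-program segments with a standalone ":" token.
    for segment in mpi_cmd.split(" : "):
        tokens = segment.split()
        n, host = None, None
        # Only the first -n and -H of each segment belong to mpirun itself;
        # later occurrences would be arguments of the launched program.
        for i, tok in enumerate(tokens[:-1]):
            if tok == "-n" and n is None:
                n = int(tokens[i + 1])
            elif tok == "-H" and host is None:
                host = tokens[i + 1]
        if host is not None:
            counts[host] += n if n is not None else 1
    return counts


# Abbreviated form of the command from the question (paths and arguments shortened).
cmd = (
    "MPI_CMD=mpirun -timestamp-output -n 1 -H s00r0 runcompss pycompssfile arg1 arg2"
    " : -n 1 -H s00r0 persistent_worker_starter.sh"
    " : -n 1 -H s00r0 persistent_worker_starter.sh"
)

for host, count in ranks_per_host(cmd).items():
    print(host, count)
```

All three segments, the runcompss master and both persistent workers, target the same host s00r0, which matches the "already filled" message.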

The following options are possible:

  • Do not pass the --tasks_in_master option to the enqueue_compss command.
  • Run with the GAT adaptor (--comm=integratedtoolkit.gat.master.GATAdaptor), which has more overhead.
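If you build the enqueue_compss argument list in a script, the two options above could be applied roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the value 1 for --tasks_in_master is only a placeholder, since the question does not show the original submission command.

```python
GAT_ADAPTOR = "integratedtoolkit.gat.master.GATAdaptor"


def without_tasks_in_master(args):
    """Option 1: return a copy of the arguments without any --tasks_in_master setting."""
    return [a for a in args if not a.startswith("--tasks_in_master")]


def with_gat_adaptor(args):
    """Option 2: return a copy of the arguments that selects the GAT adaptor."""
    # Remove any earlier --comm choice, then put the GAT one first so it
    # stays ahead of the application path and its arguments.
    kept = [a for a in args if not a.startswith("--comm=")]
    return ["--comm=" + GAT_ADAPTOR] + kept


# Placeholder submission arguments; the value 1 is an assumption.
original = ["--tasks_in_master=1", "/apps/FILESPACE/pycompssfile", "arg1", "arg2"]

print(" ".join(["enqueue_compss"] + without_tasks_in_master(original)))
print(" ".join(["enqueue_compss"] + with_gat_adaptor(original)))
```

The two helpers are alternatives; either one on its own corresponds to one of the options listed above.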

The next COMPSs release will use the spawn command provided by the different cluster resource managers (e.g. blaunch, srun), which should solve this problem.


Source: https://habr.com/ru/post/1245322/

