Running a task on multiple nodes of a GridEngine cluster

I have access to a 128-core cluster on which I would like to run a parallel job. The cluster uses Sun GridEngine, and my program is written to run with Parallel Python, numpy, and scipy on Python 2.5.8. Running the task on one node (4 cores) gives about a 3.5-fold improvement over a single core. I would now like to go to the next level and split the work across ~4 nodes. My qsub script looks something like this:

#!/bin/bash
# The name of the job, can be whatever makes sense to you
#$ -N jobname

# The job should be placed into the queue 'all.q'.
#$ -q all.q

# Redirect output stream to this file.
#$ -o jobname_output.dat

# Redirect error stream to this file.
#$ -e jobname_error.dat

# The batchsystem should use the current directory as working directory.
# Both files will be placed in the current
# directory. The batchsystem assumes to find the executable in this directory.
#$ -cwd

# request Bourne shell as shell for job.
#$ -S /bin/sh

# print date and time
date

# spython is the server version of Python 2.5. Using python instead of spython causes the program to run in python 2.3
spython programname.py

# print date and time again
date

Does anyone know how to do this?

1 answer

With Grid Engine you can ask for 16 processors by giving the -np option in your script:

# Use 16 processors
#$ -np 16

You can also pass the option on the command line when you submit the script, or, for a more permanent arrangement, put it in your .sge_request file.
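For concreteness, a sketch of those two alternatives (file name hypothetical; this mirrors the -np option used above, and the exact flag can vary between Grid Engine versions and site configurations):

# request the slots at submission time instead of inside the script
qsub -np 16 qsubscript.sh

# or make it the default for every job you submit
echo "-np 16" >> ~/.sge_request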

On the GE installations I know, this will allocate the 16 processors (or processor cores, these days) on as few nodes as necessary, so you might get 4 nodes with 4 cores each, but you might equally get 8 nodes with 2 cores each. If you need a particular layout, say 2 cores on each of 8 nodes (for memory reasons, say), you will probably have to talk to your cluster administrator.
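Getting the slots is only half the job, though: a Parallel Python program also needs a worker process on each granted node. Here is a minimal sketch of one way to wire that up, not something from the original answer: it assumes the granted hosts are listed in $PE_HOSTFILE (as they are for parallel-environment jobs), that ppserver.py is on the PATH of every node, and that passwordless ssh between nodes is permitted (some sites require qrsh instead).

#!/bin/bash
#$ -cwd
# ... the other #$ directives from the question's script go here ...

PORT=60000
SERVERS=""

# $PE_HOSTFILE holds one line per granted host: "hostname slots queue processor".
# Start one Parallel Python worker per host, sized to the slots granted there.
while read host nslots rest; do
    ssh "$host" "ppserver.py -p $PORT -w $nslots" &
    SERVERS="$SERVERS$host:$PORT,"
done < "$PE_HOSTFILE"

# Hand the worker list to the program; inside programname.py something like
#   pp.Server(ppservers=tuple(os.environ["PPSERVERS"].rstrip(",").split(",")))
# then spreads the submitted jobs across the nodes.
PPSERVERS="$SERVERS" spython programname.py

The point of the split is that Grid Engine only reserves the cores; it is ppserver.py together with the ppservers argument to pp.Server that actually moves work between nodes.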


Source: https://habr.com/ru/post/1768231/

