slurmctld: fatal: CLUSTER NAME MISMATCH

This is how I start Slurm:

mkdir -p /tmp/slurmstate/clustername
sudo slurmd
sudo munged -f
/etc/init.d/munge start 
sudo slurmdbd
sudo slurmctld -c

-

sacctmgr list cluster
   Cluster     ControlHost  ControlPort   RPC     Share GrpJobs       GrpTRES GrpSubmit MaxJobs       MaxTRES MaxSubmit     MaxWall                  QOS   Def QOS
---------- --------------- ------------ ----- --------- ------- ------------- --------- ------- ------------- --------- ----------- -------------------- ---------
   cluster                            0  7936         1                                                                                           normal

Running slurmctld -cD gives me the following error. The cluster name comes back as some invalid string that I don't recognize. How can I fix this?

> slurmctld -cD
slurmctld: fatal: CLUSTER NAME MISMATCH.
slurmctld has been started with "ClusterName=     ", but read "cluster" from the state files in StateSaveLocation.
Running multiple clusters from a shared StateSaveLocation WILL CAUSE CORRUPTION.
Remove /tmp/slurmstate/clustername to override this safety check if this is intentional (e.g., the ClusterName has changed).

Note: this problem starts when I try to run Slurm as the root user and then switch back. I had to reinstall MySQL to fix it.

Thank you for your valuable time and help.

+4
1 answer

I am a complete Slurm noob (I am only just getting interested in it for work), so I apologize if any of my suggestions are wrong, but I think I can point out something that is off.

The first line in your startup sequence:

mkdir -p /tmp/slurmstate/clustername

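creates a directory named clustername, whereas, as far as I can tell, clustername should be a file, not a directory.

Judging by the source code, slurmctld reads the cluster name from that path as a regular file (it uses fopen and fgets, as far as I can see).

The catch is that fopen does not necessarily fail on a directory (on Linux, at least, opening one read-only succeeds...). The read that follows then fails, which would explain the garbage string shown in place of the cluster name.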
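As a rough illustration of why a directory at that path is a problem, the sketch below recreates the situation in a throwaway temporary directory (the `demo_dir` location and the variable names are made up for the demo, and nothing here touches a real Slurm installation). It only shows that a path created with mkdir is a directory and cannot be read like a file.

```shell
# Work in a scratch directory so the real StateSaveLocation is left alone.
demo_dir="$(mktemp -d)"

# Reproduce the first step of the startup sequence on a small scale.
mkdir -p "$demo_dir/slurmstate/clustername"

# Classify what now sits at the clustername path.
if [ -d "$demo_dir/slurmstate/clustername" ]; then
    kind="directory"
elif [ -f "$demo_dir/slurmstate/clustername" ]; then
    kind="regular file"
else
    kind="missing"
fi
echo "clustername is a: $kind"

# Reading a directory as if it were a file fails (cat reports "Is a directory").
if cat "$demo_dir/slurmstate/clustername" >/dev/null 2>&1; then
    read_ok="yes"
else
    read_ok="no"
fi
echo "readable as a file: $read_ok"
```

Running the same `[ -d ... ]` check against /tmp/slurmstate/clustername should tell you whether your setup is in this state.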

To fix it:

  • rmdir /tmp/slurmstate/clustername

  • mkdir -p /tmp/slurmstate, so that only slurmstate is created as a directory, and not clustername (it is a file!).
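The two steps above can be sketched as a small script. The `STATE_DIR` variable is an assumption added for convenience; it defaults to the /tmp/slurmstate path from the question and should match StateSaveLocation in your slurm.conf.

```shell
# Assumed location: the state directory used in the question.
STATE_DIR="${STATE_DIR:-/tmp/slurmstate}"

# Remove the mistakenly created directory, if present.
# rmdir only succeeds on an empty directory, so nothing else is deleted.
if [ -d "$STATE_DIR/clustername" ]; then
    rmdir "$STATE_DIR/clustername"
fi

# Create only the state directory itself and leave clustername alone,
# so that slurmctld can write it as a file.
mkdir -p "$STATE_DIR"

echo "state directory ready: $STATE_DIR"
```

If the directory was created with sudo, the rmdir may need the same privileges.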

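The clustername file should then be created by slurmctld with the name taken from slurm.conf. It is also worth checking that ClusterName is actually set in slurm.conf.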
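One way to check which name slurm.conf declares is to pull the ClusterName line out of it. The helper below, `cluster_name_from_conf`, is a hypothetical function written for this sketch and is not part of Slurm. It is demonstrated on a throwaway sample file because the real location of slurm.conf depends on the installation.

```shell
# Hypothetical helper: print the ClusterName value from a slurm.conf-style file.
cluster_name_from_conf() {
    # Match "ClusterName=value" case-insensitively and skip commented-out lines;
    # if the key appears more than once, keep the last occurrence.
    grep -i '^[[:space:]]*ClusterName=' "$1" | tail -n 1 | cut -d= -f2
}

# Demonstration on a sample file (contents invented for the example).
sample_conf="$(mktemp)"
printf '%s\n' \
    '# sample configuration' \
    'ClusterName=cluster' \
    'StateSaveLocation=/tmp/slurmstate' > "$sample_conf"

conf_name="$(cluster_name_from_conf "$sample_conf")"
echo "ClusterName in conf: $conf_name"
```

Pointed at the real slurm.conf, the result can be compared with the contents of /tmp/slurmstate/clustername once that path is a regular file.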


+2

Source: https://habr.com/ru/post/1678872/

