Correspondence between storm.yaml supervisor.slots.ports and the Config.setNumWorkers(#workers) method call

Fellow Storm users:

The recommendations for setting up a Storm cluster ( https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster ) say that the supervisor.slots.ports configuration property should list a separate port for every worker you want on that machine.
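For reference, a supervisor configured for four workers would list four ports in storm.yaml; the 6700-6703 values below are Storm's conventional defaults:

```yaml
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
```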

I understand that every worker is a JVM instance that listens for commands from the Nimbus controller, so it makes sense that each one listens on a separate port.

However, there is also a backtype.storm.Config method that seems to let you set the number of workers. What happens if a setNumWorkers call requests more workers than you have configured ports for? It seems like that would break everything.

The only interpretation that makes sense to me is that the YAML configuration defines an upper bound on the number of workers. Each topology can request some number of workers, but if I submitted two topologies (to a given cluster), each calling Config.setNumWorkers(2), then I had better have configured four ports.

Is this the right idea?

Thanks in advance. -Chris

+6
3 answers

Well, it looks like my guess above was correct. I set up a single-machine cluster on my laptop, then ran ExclamationTopology (from storm-starter). I had configured only two slots, but ExclamationTopology contains the call conf.setNumWorkers(3);

Yet when I look at the Storm UI, it reports "Num Workers" as 2.

So it looks like the storm.yaml file sets the upper bound, and if you ask for more workers than you have configured ports, you simply get the maximum available.

(Caveat: I'm just diving into this material and am by no means an expert, so there is a chance I've missed something. But the above is what I observed.)

+4

Basically, you've got it right.

There is an important distinction between slots and workers. Slots are places where workers can run. When you configure a supervisor with, say, 10 slots, you configure it to run up to 10 workers concurrently on that supervisor. If you request more workers than there are slots, Storm does its best to schedule the work into the available slots (in some cases this means, for example, that a worker may occupy a slot, do some work, and then be swapped out for another worker so the topology can keep making progress), in a sense not unlike the way an OS schedules processes onto the limited number of "slots" (processors / cores / hyper-threads / whatever) it has.
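That slot-assignment idea can be sketched in plain Java. This is purely illustrative — it is not Storm's actual scheduler, and the class and method names are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of supervisor slots: each topology's requested worker count
// (what Config.setNumWorkers asks for) is assigned into whatever slots
// remain, capped at the number of configured slots.
public class SlotModel {
    static Map<String, Integer> schedule(int totalSlots, Map<String, Integer> requests) {
        Map<String, Integer> assigned = new LinkedHashMap<>();
        int free = totalSlots;
        for (Map.Entry<String, Integer> req : requests.entrySet()) {
            int workers = Math.min(req.getValue(), free); // cap at remaining slots
            assigned.put(req.getKey(), workers);
            free -= workers;
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, Integer> requests = new LinkedHashMap<>();
        requests.put("topologyA", 2); // each topology calls Config.setNumWorkers(2)
        requests.put("topologyB", 2);
        // With only 3 configured ports, the second topology is squeezed to 1 worker
        System.out.println(schedule(3, requests)); // prints {topologyA=2, topologyB=1}
    }
}
```

With four or more slots, both topologies would get their full two workers each, matching the "better set up four ports" reasoning in the question.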

+1

supervisor.slots.ports is a hard limit on the number of workers for the entire Storm cluster,

and

Config.setNumWorkers(#workers) is a soft limit on the number of workers for a given topology,

which means that, effectively, Config.setNumWorkers(#workers) <= (number of ports listed in supervisor.slots.ports).

Let's say we have only 8 ports and a topology sets its number of workers to 6. It will get 6 of the 8, and the remaining 2 worker slots will simply go unused.
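That arithmetic can be checked with a trivial sketch (the numbers are the hypothetical ones from the example above, not anything Storm computes by this name):

```java
// Illustrative only: 8 configured ports, a topology requesting 6 workers.
public class SlotArithmetic {
    static int unusedSlots(int configuredPorts, int requestedWorkers) {
        // A topology cannot be assigned more slots than are configured
        int assigned = Math.min(requestedWorkers, configuredPorts);
        return configuredPorts - assigned;
    }

    public static void main(String[] args) {
        System.out.println(unusedSlots(8, 6)); // prints 2
    }
}
```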

+1

Source: https://habr.com/ru/post/956804/
