Concurrency in an Oozie workflow: how much and how to throttle

Suppose we have an Oozie workflow that consists of a copy action node followed by a shell action node. Can I start multiple instances of this Oozie workflow and run them in parallel? What if the number of concurrent instances reaches a thousand, or even a million? Is this possible, and can Oozie sustain that level of concurrency?

If not, we will need to throttle and cap the number of concurrent Oozie workflow instances. We would prefer to throttle on the server side, in Oozie itself (ideally with a built-in Oozie feature), rather than on the client/caller side. For example, we have a huge script with lines like the ones below. We want to run it in one shot and let Oozie work out how to throttle all these instances itself. We do not want to split it into smaller pieces and run one piece at a time.

oozie job -oozie http://myhost.com:11000/oozie -config job1.properties -run
oozie job -oozie http://myhost.com:11000/oozie -config job2.properties -run
......
oozie job -oozie http://myhost.com:11000/oozie -config job1000000.properties -run
1 answer

In practice, Oozie concurrency is limited by how many Shell and MR jobs your cluster can run at the same time.

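To throttle on the server side, use an Oozie coordinator. Coordinators have built-in concurrency controls. An Oozie coordinator has <concurrency>, which caps how many of its actions can be running at the same time, and <throttle>, which caps how many of its actions can be in the WAITING state at the same time, so together they give you control over concurrency inside Oozie.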

Reference: https://oozie.apache.org/docs/3.1.3-incubating/CoordinatorFunctionalSpec.html#a6.3._Synchronous_Coordinator_Application_Definition
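
As a rough illustration of where these controls go, here is a minimal coordinator sketch. The application name, the 5-minute frequency, the limits of 10 and 50, and the ${start}, ${end} and ${workflowAppPath} properties are all assumptions made for the example, not values from the question; pick limits that match what your cluster can handle.

```xml
<!-- Hypothetical coordinator.xml: name, frequency and limits are illustrative only -->
<coordinator-app name="throttled-coord" frequency="${coord:minutes(5)}"
                 start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.2">
  <controls>
    <!-- at most 10 actions of this coordinator RUNNING at the same time (assumed value) -->
    <concurrency>10</concurrency>
    <!-- order in which ready actions are picked up -->
    <execution>FIFO</execution>
    <!-- at most 50 actions of this coordinator WAITING at the same time (assumed value) -->
    <throttle>50</throttle>
  </controls>
  <action>
    <workflow>
      <!-- HDFS path of the workflow with the copy and shell actions, supplied via job properties -->
      <app-path>${workflowAppPath}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

Such a coordinator would be submitted with the same oozie job ... -config ... -run command, with the properties file pointing at the coordinator application instead of the workflow.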

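Note that Oozie materializes coordinator actions on a timer, every 5 minutes by default, so new actions can take up to that long to appear. If 5 minutes is too long for you, change oozie.service.CoordMaterializeTriggerService.lookup.interval (in seconds) in oozie-site.xml.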
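
A sketch of that override in oozie-site.xml is below. The value of 60 seconds is only an example; a shorter interval makes the materialization service run more often, so it adds load on the Oozie server.

```xml
<!-- oozie-site.xml fragment: run coordinator materialization every 60 s (illustrative value) -->
<property>
  <name>oozie.service.CoordMaterializeTriggerService.lookup.interval</name>
  <value>60</value>
</property>
```

The Oozie server normally has to be restarted for changes in oozie-site.xml to take effect.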


Source: https://habr.com/ru/post/1525672/

