Hadoop Streaming and multiple reduce steps without a mapper between each step

I understand how to reduce my data several times, but not how to avoid mapping it every time.

I'd like to run: mapper 1 → reducer 1 → reducer 2 → reducer 3

I want to take the output of reducer 1 (key, data) and then go straight to reducer 2. Is this possible?

I found out that you can chain jobs, but does this require a mapper for each step?

Whenever I try to run without a mapper, it ends in an error. Running a mapper for every step seems like a waste of time and resources if I can just pipe the output of reducer 1 onward as needed.

Thoughts?

1 answer

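In short, if you use Java, ChainMapper and ChainReducer are what you need. With these classes you can chain any number of mappers before a reducer and any number after it inside a single job (the pattern is [MAP+ / REDUCE MAP*]). Note that a chained job still has only one reduce phase, so each further reduce step means a further job.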

The book "Hadoop in Action" describes this procedure in Chapter 5.
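
ChainMapper and ChainReducer are Java classes, so for Hadoop Streaming itself a common workaround is to give each later job a mapper that simply passes records through. Below is a minimal sketch of such a pass-through mapper in Python. The function name identity_mapper and the sample records are made up for illustration, and nothing here comes from the original question.

```python
import io
import sys


def identity_mapper(stdin, stdout):
    """Copy every input record to the output unchanged.

    Hadoop Streaming hands records to the mapper as lines of text, so
    passing lines through untouched leaves the key/value pairs from the
    previous reducer exactly as they were.
    """
    for line in stdin:
        # Ensure every record ends with a newline, even the last one.
        stdout.write(line if line.endswith("\n") else line + "\n")


# Local demonstration with in-memory streams. In a real streaming script
# the call would be identity_mapper(sys.stdin, sys.stdout) instead.
demo_out = io.StringIO()
identity_mapper(io.StringIO("k1\t3\nk2\t5"), demo_out)
print(demo_out.getvalue(), end="")
```

A script like this would be passed to the streaming job through the -mapper option for reduce steps 2 and 3. The shuffle and sort between map and reduce still runs for each job, so the saving is only in the map logic itself. Passing /bin/cat as the mapper is a widely used equivalent that needs no script at all.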

Source: https://habr.com/ru/post/1433670/