I want to do an outer join across three relations in Pig. I tried this:
features = JOIN group_event by group left outer, group_session by group, group_order by group;
I want every group_event row to be present in the output, even if only one, or neither, of the other two relations has a match for it.
The above statement does not work, and according to the documentation it is not supposed to (http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#JOIN+%28outer%29):
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.
Splitting it into two-way joins works and can be done as follows:
features1 = JOIN group_event by group left outer, group_session by group;
features2 = JOIN features1 by group_event::group left outer, group_order by group;
Any ideas on how to do this in one statement? (That would be helpful if I join even more tables.)