I am using Pig to analyze application logs to find out which public methods were called by a user but had not been called last month (by the same user).
I have managed to get the methods called, grouped by user, before last month and after last month:
Sample of the BEFORE last month relation
u1 {(m1),(m2)}
u2 {(m3),(m4)}
Sample of the AFTER last month relation
u1 {(m1),(m3)}
u2 {(m1),(m4)}
What I want is to find, per user, which methods are in AFTER but not in BEFORE, i.e.
Expected NEWLY_CALLED result
u1 {(m3)}
u2 {(m1)}
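To make the semantics precise, here is a small sketch of the subtraction I have in mind, written in Python rather than Pig. The dictionaries and the function name newly_called are only illustrative stand-ins for the two relations, and I am assuming that a user with no entry in BEFORE should have all of their AFTER methods count as new.

```python
# Hypothetical in-memory stand-ins for the two Pig relations
# (user -> set of methods); the names are illustrative only.
before = {"u1": {"m1", "m2"}, "u2": {"m3", "m4"}}
after = {"u1": {"m1", "m3"}, "u2": {"m1", "m4"}}


def newly_called(before, after):
    """For each user in `after`, keep the methods absent from that user's `before` set.

    Assumption: a user missing from `before` is treated as having an empty set.
    """
    return {
        user: methods - before.get(user, set())
        for user, methods in after.items()
    }


result = newly_called(before, after)
for user in sorted(result):
    print(user, sorted(result[user]))
```

This is a per-user set difference (AFTER minus BEFORE), not a symmetric difference.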
Question: How can I do this in Pig? Is it possible to subtract bags?
I tried the DIFF function, but it does not perform the expected subtraction.
Regards,
Joel