If you have GNU awk, you can try something like this:
gawk '
NR==FNR { a[$2]=$1; next }
!($2 in a) { print $2,$1; next }
($2 in a) {
    "date +%s -d " $1 | getline var1
    "date +%s -d " a[$2] | getline var2
    var3 = var2 - var1
    if (var3 > 4) print $2, $1, a[$2]
}' output.log input.log
Test:
[jaypal:~/Temp] cat input.log
2012-01-16T09:00:00 9
2012-01-16T10:00:00 10
2012-01-16T11:00:00 11

[jaypal:~/Temp] cat output.log
2012-01-16T10:00:04 10
2012-01-16T11:00:10 11
2012-01-16T12:00:00 12

[jaypal:~/Temp] gawk ' NR==FNR{a[$2]=$1;next} !($2 in a) {print $2,$1; next} ($2 in a) {"date +%s -d " $1 | getline var1; "date +%s -d " a[$2] | getline var2;var3=var2-var1;if (var3>4) print $2,$1,a[$2] }' output.log input.log
9 2012-01-16T09:00:00
11 2012-01-16T11:00:00 2012-01-16T11:00:10
Explanation:
NR==FNR{a[$2]=$1;next}
We start by storing the first field of every line in output.log in an array indexed by the second field. We use next to skip the remaining pattern{action} statements for that line. The NR==FNR condition is true only while the first file is being read, so it lets us read output.log completely before input.log is touched.
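The two-file idiom described above can be tried in isolation. This is a minimal sketch that should work with any POSIX awk; the file names first.txt and second.txt, the array name seen, and the scratch directory are made up for the illustration and are not part of the original answer.

```shell
# Hypothetical scratch files, created only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'A 1' 'B 2' > "$tmpdir/first.txt"
printf '%s\n' 'C 2' 'D 3' > "$tmpdir/second.txt"

# NR counts records across all files, FNR restarts at 1 for each file,
# so NR==FNR holds only while the first file is being read.
result=$(awk '
    # First file: remember field 1 under the key in field 2, then skip the rest.
    NR == FNR { seen[$2] = $1; next }
    # Second file only: report whether the key was stored earlier.
    { print $2, (($2 in seen) ? "seen as " seen[$2] : "new") }
' "$tmpdir/first.txt" "$tmpdir/second.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints "2 seen as B" and "3 new". One caveat of the idiom: if the first file is empty, NR==FNR is also true for the second file, so the first block would run on the wrong data.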
!($2 in a) {print $2,$1; next}
Once output.log has been read completely, we move on to input.log. For each line we check whether its second field is missing from our array (i.e. it never appeared in output.log). If it is missing, we print the second and first fields and use next to move on to the next line. This repeats for every such line in input.log.
($2 in a) {"date +%s -d " $1 | getline var1; "date +%s -d " a[$2] | getline var2; var3=var2-var1; if (var3 > 4) print $2,$1,a[$2] }
Here we handle the lines whose second field is present in both files. For these we need to calculate the time difference. We run the external date command to convert each timestamp to seconds since the epoch. The command prints to STDOUT by default, so we pipe its output into awk's getline and capture it in a variable (var1 and var2). Once both values are stored, we subtract them and save the result in var3; if var3 is greater than 4, we print the line in the desired format.
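Each "date ..." | getline above starts a separate date process, and the pipes are never closed with close(), so a long input.log could run into the open-file limit, and a timestamp that appears twice would reuse an already exhausted pipe. As a hedged alternative, the sketch below converts the timestamps with gawk's built-in mktime() instead of calling date. The helper name to_epoch, the variable awk_bin, and the scratch directory are made up for this illustration; the fallback to plain awk only works if that awk also provides mktime(), which is an extension and not part of POSIX awk.

```shell
# Scratch copies of the sample files from the test above.
tmpdir=$(mktemp -d)
printf '%s\n' '2012-01-16T09:00:00 9' '2012-01-16T10:00:00 10' '2012-01-16T11:00:00 11' > "$tmpdir/input.log"
printf '%s\n' '2012-01-16T10:00:04 10' '2012-01-16T11:00:10 11' '2012-01-16T12:00:00 12' > "$tmpdir/output.log"

# Prefer gawk; fall back to whatever awk is installed (assumption: it has mktime).
awk_bin=$(command -v gawk || command -v awk)

result=$("$awk_bin" '
    # to_epoch (hypothetical helper): "2012-01-16T09:00:00" -> seconds since the epoch.
    # mktime() expects "YYYY MM DD HH MM SS", so the separators become spaces.
    # ts is passed by value, so the caller still sees the original timestamp.
    function to_epoch(ts) {
        gsub(/[-T:]/, " ", ts)
        return mktime(ts)
    }
    NR == FNR  { a[$2] = $1; next }      # output.log: timestamp keyed by id
    !($2 in a) { print $2, $1; next }    # id never appeared in output.log
    {
        diff = to_epoch(a[$2]) - to_epoch($1)
        if (diff > 4) print $2, $1, a[$2]
    }
' "$tmpdir/output.log" "$tmpdir/input.log")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On the sample files this prints the same two lines as the test run above. mktime() interprets the fields in the local time zone, which is harmless here because only the difference is used, although a DST change between the two timestamps would skew it.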