Parse ns2 trace file

I am using NS 2.35 and trying to determine the end-to-end delay of my routing algorithm.

I think anyone with good scripting experience will be able to answer this question; unfortunately, that person is not me.

I have a trace file that looks something like this:

    - -t 0.548 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1052 -a 0 -x {2.0 17.0 6 ------- null}
    h -t 0.548 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1052 -a 0 -x {2.0 17.0 -1 ------- null}
    + -t 0.55 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1056 -a 0 -x {2.0 17.0 10 ------- null}
    + -t 0.555 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1057 -a 0 -x {2.0 17.0 11 ------- null}
    r -t 0.556 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1047 -a 0 -x {2.0 17.0 1 ------- null}
    + -t 0.556 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1047 -a 0 -x {2.0 17.0 1 ------- null}
    - -t 0.556 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1047 -a 0 -x {2.0 17.0 1 ------- null}

But here is what I need to do.

A line starting with + marks a packet being added to the network (enqueued). A line starting with r marks a packet being received by the destination. The floating-point number after -t is the time at which the event occurred. Finally, the number after -i is the packet's identifier.

To calculate the average end-to-end delay, I need to find every line with a given identifier after -i. From those lines, I need to take the timestamp of the r line minus the timestamp of the + line.

So I suppose each line could be split on spaces (perhaps with a regular expression), with each segment stored in its own variable. Then I would check the 15th field (the packet identifier).
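As a rough illustration of the splitting idea, here is a minimal Python sketch (Python rather than awk, purely for readability). The function name `parse_event` is hypothetical, and the sketch assumes every line follows the layout of the sample above: an event character followed by flag/value pairs. With whitespace splitting, the time is the 3rd field and the packet identifier is the 15th, but reading the values by flag name avoids depending on those positions.

```python
def parse_event(line):
    """Split one trace line into (event, time, packet_id).

    Assumes the layout shown in the sample: an event character followed by
    flag/value pairs such as "-t 0.548" and "-i 1052".
    Returns None for blank lines or lines that lack -t or -i.
    """
    fields = line.split()
    if len(fields) < 3:
        return None
    event = fields[0]
    # Collect flag/value pairs, stopping at "-x" because its braces hold
    # several space-separated values rather than a single one.
    flags = {}
    i = 1
    while i + 1 < len(fields) and fields[i] != "-x":
        flags[fields[i]] = fields[i + 1]
        i += 2
    if "-t" not in flags or "-i" not in flags:
        return None
    return event, float(flags["-t"]), flags["-i"]


sample = "+ -t 0.55 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1056 -a 0 -x {2.0 17.0 10 ------- null}"
print(parse_event(sample))  # ('+', 0.55, '1056')
```

The packet identifier is kept as a string because it is only used as a lookup key, never for arithmetic.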

But I'm not sure where to go from there, or how to put it all together.

I know there are some AWK scripts on the Internet for this, but they are outdated and do not fit the current format (and I'm not sure how to change them).

Any help would be greatly appreciated.

EDIT:

Here is an example of the complete route of one packet. I removed a lot of lines in between, so that you can see the individual packet events.

    # a packet is enqueued from node 2 going to node 7. Its ID is 1636. this was at roughly 1.75sec
    + -t 1.74499999999998 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # at 2.1s, it left node 2.
    - -t 2.134 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # at 2.134 it hopped from 2 to 7 (not important)
    h -t 2.134 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 -1 ------- null}
    # at 2.182 it was received by node 7
    r -t 2.182 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # it was then enqueued by node 7 to be sent to node 12
    + -t 2.182 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # slightly later it left node 7 on its way to node 12
    - -t 2.1832 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # it hopped from 7 to 12 (not important)
    h -t 2.1832 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 -1 ------- null}
    # received by 12
    r -t 2.2312 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # added to queue, heading to node 17
    + -t 2.2312 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # left for node 17
    - -t 2.232 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}
    # hopped to 17 (not important)
    h -t 2.232 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 -1 ------- null}
    # received by 17, notice the time delay
    r -t 2.28 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}

The ideal script would recognize 2.134 as the start time and 2.28 as the end time, and then give me a delay of 0.146 seconds. It would do this for every packet identifier and report only the average.

I was asked to expand a bit on how the file works and what I expect.

The file lists events for about 10,000 packets. Each packet may be in a different state. The important states are + , which means that the packet was queued on a router, and r , which means that the packet was received by its recipient.

It is possible that a packet that was queued (so it has a + record) is never actually received and is dropped instead. This means that we cannot assume that every + record has a matching r record.

What I'm trying to measure is the average end-to-end delay. This means that if you look at one packet, it has a time at which it was enqueued and a time at which it was received. I need to subtract one from the other to find its end-to-end delay. But I also need to do this for the 9,999 other packets to get the average.

I have thought about it some more, and I believe the following algorithm should work.

  • Delete all lines that do not start with + or r , because they are irrelevant.
  • Go through all the packet identifiers (that is, the numbers after -i, such as 1052 in the example) and put the lines into groups by identifier (possibly several arrays).
  • Each group should now contain all the information about one particular packet.
  • Inside the group, check whether there is a + line; ideally we want the very first one. Record its time.
  • Look for more + lines and check their times. The log may be slightly out of order, so a later + line may actually have happened earlier in the simulation.
  • If a new + line has an earlier time, update the time variable with it.
  • Once there are no more + lines, look for the r lines.
  • If there is no r line, the packet was dropped, so ignore it.
  • Among all the r lines found, all we have to do is pick the one with the latest timestamp.
  • The r line with the latest timestamp is where the packet was finally received.
  • Subtract the + time from the r time; this gives the time the packet needed to travel.
  • Add this value to an array so that it can be averaged later.
  • Repeat this process for each group of packet identifiers, and finally average the resulting array of delays.
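The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tested tool for real traces: the function name `average_delay` is hypothetical, every relevant line is assumed to have the time in the 3rd field and the packet identifier in the 15th, and the line for packet 9999 in the sample is made up to show a dropped packet.

```python
def average_delay(lines):
    """Average (latest 'r' time - earliest '+' time) over received packets."""
    first_added = {}    # packet id -> earliest "+" timestamp
    last_received = {}  # packet id -> latest "r" timestamp
    for line in lines:
        fields = line.split()
        # Keep only "+" and "r" events that match the assumed layout.
        if len(fields) < 15 or fields[0] not in ("+", "r"):
            continue
        if fields[1] != "-t" or fields[13] != "-i":
            continue
        time, packet = float(fields[2]), fields[14]
        if fields[0] == "+":
            if packet not in first_added or time < first_added[packet]:
                first_added[packet] = time
        else:
            if packet not in last_received or time > last_received[packet]:
                last_received[packet] = time
    # Packets with no "r" line (dropped) or no "+" line are skipped.
    delays = [last_received[p] - first_added[p]
              for p in last_received if p in first_added]
    return sum(delays) / len(delays) if delays else None


trace = [
    "+ -t 1.74499999999998 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "- -t 2.134 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "r -t 2.182 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "+ -t 2.182 -s 7 -d 12 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "r -t 2.28 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    # Made-up line: a packet that is queued but never received.
    "+ -t 2.3 -s 2 -d 7 -p cbr -e 500 -c 0 -i 9999 -a 0 -x {2.0 17.0 250 ------- null}",
]
print("Average delay %.3f" % average_delay(trace))  # Average delay 0.535
```

Because an open file is iterable line by line, the same function could be called as `average_delay(open("trace.file"))`. Grouping happens implicitly through the two dictionaries, so the lines never need to be sorted or stored.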

That was a lot of typing, but I think this is as clear as I can be about what I want. I wish I were a master of regular expressions, but I just don't have time to learn them well enough to pull this off.

Thanks for your help and let me know if you have any questions.

1 answer

There isn't much to go on here, as Ian said in the comments on your question, but if I understand correctly what you want to do, something like this should work:

    awk '/^[+r]/{$1~/r/?r[$15]=$3:r[$15]?d[$15]=r[$15]-$3:1} END {for(p in d){sum+=d[p];num++}print sum/num}' trace.file

It skips all lines that do not begin with '+' or 'r'. If the line begins with 'r', it stores the time in the r array. Otherwise, it computes the delay and stores it in the d array, provided an entry for that packet is already in the r array. Finally, it iterates over the elements of the d array, sums up the total delay, counts the elements, and computes the average from those. For your sample, the average is 0.

The :1 at the end of the main block is only there so that I can get away with a ternary expression instead of a much more verbose if statement.

EDIT: New expression that handles the added conditions:

    awk '/^[+r]/{$1~/r/?$3>r[$15]?r[$15]=$3:1:!a[$15]||$3<a[$15]?a[$15]=$3:1} END {for(i in r){sum+=r[i]-a[i];num++}print "Average delay", sum/num}' trace.file

or as an awk file:

    /^[+r]/ {
        if ($1 ~ /r/) {
            if ($3 > received[$15])
                received[$15] = $3;
        } else {
            if (!added[$15] || $3 < added[$15])
                added[$15] = $3;
        }
    }
    END {
        for (packet in received) {
            sum += received[packet] - added[packet];
            num++
        }
        print "Average delay", sum/num
    }

Note that according to your algorithm, 1.745 (the first + line) would be the start time, whereas you wrote that it is 2.134.
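To make the difference concrete, here is a small Python sketch that computes the delay of packet 1636 twice, once starting from its earliest + line and once from its earliest - line. The helper `packet_delay` and its `start_event` parameter are hypothetical names for illustration, and the sketch assumes the field positions seen in the question (time in field 3, packet identifier in field 15).

```python
def packet_delay(lines, packet_id, start_event):
    """Latest 'r' time minus earliest start_event time for one packet."""
    start = None
    end = None
    for line in lines:
        fields = line.split()
        # Skip lines that do not match the assumed layout or packet.
        if len(fields) < 15 or fields[1] != "-t" or fields[14] != packet_id:
            continue
        time = float(fields[2])
        if fields[0] == start_event and (start is None or time < start):
            start = time
        elif fields[0] == "r" and (end is None or time > end):
            end = time
    if start is None or end is None:
        return None  # packet never started or was never received
    return end - start


trace = [
    "+ -t 1.74499999999998 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "- -t 2.134 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "r -t 2.182 -s 2 -d 7 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
    "r -t 2.28 -s 12 -d 17 -p cbr -e 500 -c 0 -i 1636 -a 0 -x {2.0 17.0 249 ------- null}",
]
print(round(packet_delay(trace, "1636", "+"), 3))  # 0.535, queueing time included
print(round(packet_delay(trace, "1636", "-"), 3))  # 0.146, measured from leaving node 2
```

Which start event is appropriate depends on whether the time spent waiting in the first queue should count as part of the end-to-end delay.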


Source: https://habr.com/ru/post/1384038/

