Gnuplot Coloring Intervals for Missing Values

Question


I have time data where some time intervals contain only missing values. I want to explicitly specify those ranges of missing values.

Currently, the solution that I have is to check if the value is NaN or not, as such:

 plot file_name using 1:(stringcolumn(num_column) eq "NaN" ? 1/0 : column(num_column)) with lines,\
      "" using 1:(stringcolumn(num_column) eq "NaN" ? 1000 : 1/0) with points

This draws points at y = 1000 for the missing values instead of a line, which gives the following result:

However, this is not ideal, because: a) I need to specify a y value for drawing points, and b) it is pretty ugly, especially when the data set is longer.

I would like to do something like this:

That is, completely fill this interval with color (possibly with some transparency, unlike my image). Please note that in these examples there is only one interval of missing values, but in fact there can be any number of them on one chart.


gnuplot

Fatalize Feb 23 '16 at 15:25


2 answers

Using two filled curves

A somewhat “hacky” way to do this is to use two filled curves:

 plot file_name using 1:(stringcolumn(num_column) eq "NaN" ? 1/0 : column(num_column)) with lines ls 2,\
      "" using 1:(stringcolumn(num_column) eq "NaN" ? 0 : 1/0) with filledcurve x1 ls 3,\
      "" using 1:(stringcolumn(num_column) eq "NaN" ? 0 : 1/0) with filledcurve x2 ls 3

Both filled curves must use the same linestyle, so that together they form one uniform rectangle.

One filled curve takes x1 as its parameter and the other x2, so one fills from y = 0 down to the bottom axis and the other from y = 0 up to the top axis.

You can remove the border line drawn at 0 and make the fill transparent using this:

 set style fill transparent solid 0.8 noborder

This is the result:

Note that the dashed line at 0 under the rectangle looks slightly off compared with the other dashed lines. Also, rectangles that are very narrow will look lighter than expected.


Fatalize Feb 23 '16 at 15:56


Matthew · Accepted Answer · 2016-02-23T21:35:55+0000

We can do preprocessing to accomplish this. Suppose we have the following data file: data.txt

 1 8
 2 6
 4 NaN
 5 NaN
 6 NaN
 7 9
 8 10
 9 NaN
 10 NaN
 11 6
 12 11

and the following Python 3 program, process.py (Python is obviously not the only way to do this) ¹

 data = [x.strip().split() for x in open("data.txt", "r")]
 i = 0
 while i < len(data):
     if data[i][1] == "NaN":
         print(data[i-1][0], end=" ")  # or use data[i][0]
         i += 1
         while data[i][1] == "NaN":
             i += 1
         print(data[i][0], end=" ")  # or use data[i-1][0]
     else:
         i += 1

This Python program reads the data file and, for each range of NaN values, prints the last good and next good x-coordinates. For the sample data file it outputs 2 7 8 11 , which can be used as borders for drawing rectangles. Now, in gnuplot, we can do ²

 breaks = system("process.py")
 set for [i=0:words(breaks)/2-1] object (i+1) rectangle \
     from word(breaks,2*i+1),graph 0 to word(breaks,2*i+2),graph 1 \
     fillstyle solid noborder fc rgb "orange"

This draws filled rectangles over those ranges. It determines how many “blocks” (groups of two values) are in the breaks variable, and then reads the values two at a time, using them as the left and right borders of the rectangles.
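The pairing described above can also be sketched outside gnuplot, which is handy for checking what the loop will draw. The Python helper below is only an illustration: the function name rectangle_commands and its color parameter are hypothetical and not part of the answer. It takes the same space-separated string of borders and builds one set object line per pair.

```python
def rectangle_commands(breaks, color="orange"):
    """Build gnuplot 'set object' lines from a flat string of x-borders.

    Hypothetical helper for illustration: 'breaks' is a string such as
    "2 7 8 11", read as consecutive (left, right) pairs.
    """
    values = breaks.split()
    commands = []
    # Walk the values two at a time: even positions are left borders,
    # odd positions are right borders. zip() drops an unpaired last value.
    for n, (left, right) in enumerate(zip(values[0::2], values[1::2]), start=1):
        commands.append(
            f"set object {n} rectangle from {left},graph 0 to {right},graph 1 "
            f'fillstyle solid noborder fc rgb "{color}"'
        )
    return commands


if __name__ == "__main__":
    for line in rectangle_commands("2 7 8 11"):
        print(line)
```

An unpaired trailing value is silently ignored here, which mirrors the integer division in words(breaks)/2 on the gnuplot side.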

Finally, plotting the data

 plot "data.txt" u 1:2 with lines

produces

which shows filled rectangles in the range of NaN values.

To ensure wider applicability, the following process.awk ³ awk program performs the same task as the above python program if awk is available and python is not:

 BEGIN {
     started = 0;
     last = "";
     vals = "";
 }
 ($2=="NaN") {
     if (started==0) {
         vals = vals " " last;
         started = 1;
     }
 }
 ($2!="NaN") {
     last = $1
     if (started==1) {
         vals = vals " " last;
         started = 0;
     }
 }
 END {
     sub(/^ /,"",vals);
     print vals;
 }

We can use this by replacing the system call above with

 breaks = system("awk -f process.awk data.txt")

¹ The borders extend to the last and next good points so that the gap is filled completely. If this is undesirable, the commented-out values will cover only the area identified by NaN in the file (4-6 and 9-10 in the example). The program does not handle NaN values at the first or last data point.

² I used orange for the gaps. Feel free to use any color specification.

³ The awk program expands the boundaries in the same way as the python program, but requires more changes to get a different behavior. It has the same limitations in that it does not process NaN values as the first or last data point.
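The limitation mentioned in the footnotes (NaN as the first or last data point) could be worked around along the following lines. This is a minimal sketch rather than the answer's own method: the function name nan_intervals and the extend flag are hypothetical, and it assumes every row has exactly two whitespace-separated columns. When a run of NaN values has no good neighbour on one side, it falls back to the x-coordinate of the NaN point itself.

```python
def nan_intervals(rows, extend=True):
    """Return (left, right) x-pairs, one per run of "NaN" y-values.

    rows   -- list of [x, y] string pairs in file order (assumed format)
    extend -- True: stretch each interval to the neighbouring good points;
              False: cover only the NaN points themselves.
    """
    intervals = []
    i = 0
    while i < len(rows):
        if rows[i][1] != "NaN":
            i += 1
            continue
        start = i
        # Advance to the end of this run, stopping safely at end of data.
        while i < len(rows) and rows[i][1] == "NaN":
            i += 1
        end = i - 1  # index of the last NaN row in the run
        # Use the neighbouring good point when one exists, else the NaN row.
        left = start - 1 if extend and start > 0 else start
        right = end + 1 if extend and end + 1 < len(rows) else end
        intervals.append((rows[left][0], rows[right][0]))
    return intervals


if __name__ == "__main__":
    sample = "1 8\n2 6\n4 NaN\n5 NaN\n6 NaN\n7 9\n8 10\n9 NaN\n10 NaN\n11 6\n12 11"
    rows = [line.split() for line in sample.splitlines()]
    # Same flat format that the gnuplot 'breaks' variable expects.
    print(" ".join(x for pair in nan_intervals(rows) for x in pair))
```

For the sample data this prints 2 7 8 11, the same borders as process.py. Passing extend=False restricts each rectangle to the NaN points only.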
