Numpy.polyfit does not process NaN values

Question

I have a problem with this Python code snippet:

    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, selected before importing pylab
    import numpy as np
    import pylab as pl

    A1 = np.loadtxt('/tmp/A1.txt', delimiter=',')
    A1_extrema = [min(A1), max(A1)]
    A2 = np.loadtxt('/tmp/A2.txt', delimiter=',')

    pl.close()

    # Linear fit of A2 against A1
    ab = np.polyfit(A1, A2, 1)
    print(ab)
    fit = np.poly1d(ab)
    print(fit)

    # Correlation coefficient between the two sensors
    r2 = np.corrcoef(A1, A2)[0, 1]
    print(r2)

    pl.plot(A1, A2, 'r.', label='TMP36 vs. DS18B20', alpha=0.7)
    pl.plot(A1_extrema, fit(A1_extrema), 'c-')
    pl.annotate('{0}'.format(r2), xy=(min(A1) + 0.5, fit(min(A1))), size=6, color='r')
    pl.title('Sensor correlations')
    pl.xlabel("T(x) [degC]")
    pl.ylabel("T(y) [degC]")
    pl.grid(True)
    pl.legend(loc='upper left', prop={'size': 8})
    pl.savefig('/tmp/C123.png')

A1 and A2 are arrays containing temperature readings from two different sensors. I want to find the correlation between them and show it graphically. Occasionally a sensor read error occurs, and in that case NaN is written to one of the files instead of a temperature value. np.polyfit then fails to fit the data and returns [nan, nan], and everything after that fails as well.

My question is: how can I convince numpy.polyfit to ignore NaN values? NB: The data sets are relatively small at the moment. I expect them to grow to about 200,000 to 600,000 elements after deployment.

python numpy nan

Mausy5043 Feb 21 '15 at 14:57

1 answer

Tomho · Accepted Answer · 2016-05-20T21:01:20+0000

I know this is a little old, but if your arrays contain NaNs, you need to "clean" them by keeping only the indices where both values are finite. One way to do it:

    idx = np.isfinite(x) & np.isfinite(y)
    ab = np.polyfit(x[idx], y[idx], 1)

This way you only pass "good" points to polyfit.
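For a self-contained illustration, here is a minimal sketch of the same masking idea. The arrays x and y are made-up sample data standing in for the A1 and A2 sensor files, not real readings. Applying the same mask to np.corrcoef is my assumption about what the question needs, since that call also yields NaN when any input is NaN.

```python
import numpy as np

# Hypothetical sample data standing in for the two sensor files.
# The valid pairs lie on the line y = 2*x + 1; each array has one NaN.
x = np.array([20.0, 21.0, np.nan, 23.0, 24.0, 25.0])
y = np.array([41.0, 43.0, 45.0, np.nan, 49.0, 51.0])

# Boolean mask: True only where BOTH readings are finite.
idx = np.isfinite(x) & np.isfinite(y)

# Fit and correlate using only the valid pairs.
ab = np.polyfit(x[idx], y[idx], 1)
fit = np.poly1d(ab)
r = np.corrcoef(x[idx], y[idx])[0, 1]

print("{0} of {1} points used".format(idx.sum(), idx.size))
print(ab)
print(r)
```

The mask is built with vectorized operations in a single pass over the data, so it should stay cheap for arrays with a few hundred thousand elements. For the plot, x[idx] and y[idx] can be used in place of the raw arrays so that min and max are not affected by NaN.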
