Instability of pandas data calculations

Question

I am wondering if anyone has seen this problem with pandas before. Basically, I'm trying to add, multiply, and divide DataFrames in stages (all frames have the same indexes and columns), but pandas spits out different results for the same calculation when it is run several times in a row.

An image of some sample output is shown below. I used .values in the code only for display purposes; the instability can also occur when using .add(), .mul() or .div(). For example, if I recompute N11.add(N00), I usually get the correct answer, but sometimes (roughly one time in 4 or 5) I get a DataFrame filled with 0s.

In case it matters, I'm on Windows 10 using the Anaconda distribution with pandas 0.17.0 (Python 2.7.10, Spyder 2.3.7). The frames I'm working with are large (6856 by 12511). Has anyone else encountered this problem? Is this a known issue, or am I doing something wrong?

python numpy pandas

user3111891 Nov 10 '15 at 19:52

1 answer

mactyr · Accepted Answer · 2016-02-03T05:42:01+0000

Today I ran into a similar problem; it was caused by a bug in numexpr 2.4.4. The bug seems to have bitten pandas users in several different ways, as reported in this pandas ticket and others related to it.
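To see whether you are on the affected release, you can check the installed numexpr version. This is a minimal sketch: `is_affected` is a hypothetical helper name, and it flags only 2.4.4 because that is the only version identified above.

```python
# Report whether the installed numexpr is the 2.4.4 release named above.
try:
    import numexpr
    numexpr_version = numexpr.__version__
except ImportError:
    numexpr_version = None  # numexpr is an optional dependency of pandas


def is_affected(version):
    """Hypothetical helper: True only for the release identified as buggy."""
    return version == "2.4.4"


print("numexpr version:", numexpr_version)
print("affected:", is_affected(numexpr_version))
```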

Updating numexpr to 2.4.6 solved the problem for us, but it looks like any version that is not 2.4.4 should be fine!
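After upgrading, one way to convince yourself the problem is gone is to repeat the same operation and compare each result against a plain NumPy sum. The sketch below is only an illustration: `count_mismatches` is a hypothetical helper, and the random frames are stand-ins for the N00 and N11 frames from the question, with an arbitrary shape.

```python
import numpy as np
import pandas as pd


def count_mismatches(a, b, repeats=20):
    """Repeat a.add(b) and count the runs that differ from the NumPy sum."""
    expected = a.values + b.values  # plain NumPy arrays, pandas not involved
    bad = 0
    for _ in range(repeats):
        result = a.add(b).values
        if not np.array_equal(result, expected):
            bad += 1
    return bad


# Stand-in data; the shape is an assumption, not the asker's actual frames.
rng = np.random.RandomState(0)
N00 = pd.DataFrame(rng.rand(2000, 1000))
N11 = pd.DataFrame(rng.rand(2000, 1000))

print("mismatching runs:", count_mismatches(N11, N00))
```

On a healthy install the count should be 0. If upgrading is not possible right away, pandas also has a `compute.use_numexpr` option; setting it with `pd.set_option("compute.use_numexpr", False)` should make pandas skip numexpr, though we did not test that route ourselves.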
