I seem to have found a pitfall with using .sum() on numpy arrays, but I cannot find an explanation. If I try to sum a large array, I start getting nonsensical answers. This happens silently, and I cannot make enough sense of the output to search for the cause.
For example, this works exactly as expected:
import numpy as np
a = sum(xrange(2000))
print('a is {}'.format(a))
b = np.arange(2000).sum()
print('b is {}'.format(b))
Both give the same output:
a is 1999000
b is 1999000
However, this does not work:
c = sum(xrange(200000))
print('c is {}'.format(c))
d = np.arange(200000).sum()
print('d is {}'.format(d))
This gives the following output:
c is 19999900000
d is -1474936480
With an even larger array, the wrong result can be positive. This is more insidious, because nothing in the output tells me that anything unusual has happened. For instance:
e = sum(xrange(100000000))
print('e is {}'.format(e))
f = np.arange(100000000).sum()
print('f is {}'.format(f))
Gives the following:
e is 4999999950000000
f is 887459712
I assumed that this has to do with data types, and indeed using a Python float seems to fix the problem:
e = sum(xrange(100000000))
print('e is {}'.format(e))
f = np.arange(100000000, dtype=float).sum()
print('f is {}'.format(f))
Giving:
e is 4999999950000000
f is 4.99999995e+15
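One check that seems to support the data-type suspicion: the wrong values above are exactly what the true sums become when wrapped into a signed 32-bit integer. The sketch below uses plain Python only. `wrap_int32` is a hypothetical helper written for illustration, and the idea that numpy is accumulating in 32-bit integers here is an assumption, not something confirmed above.

```python
def wrap_int32(n):
    """Reduce an exact Python integer to the value a signed 32-bit integer would hold."""
    return (n + 2**31) % 2**32 - 2**31


def exact_sum(n):
    """Exact sum of 0, 1, ..., n-1, computed with arbitrary-precision Python integers."""
    return n * (n - 1) // 2


for n in (2000, 200000, 100000000):
    true_total = exact_sum(n)
    print('n = {}: exact {}, wrapped to 32 bits {}'.format(
        n, true_total, wrap_int32(true_total)))
```

The wrapped values for the second and third cases match the `d` and `f` results from the integer examples, which is at least consistent with a 32-bit overflow.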
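I have no background in Comp. Sci., so I may be missing something obvious. Things I have considered:
- numpy arrays have a size limit: nope, I would expect to hit a MemoryError first.
- I have a 32-bit installation: nope, I checked, and it is 64-bit.
- Other reports of odd sum behaviour: nope (?), I could not see how they apply here.
Can someone explain what I am missing? And, other than remembering to set dtype every time, is there a way to stop this from happening or to get a warning?
Possibly relevant: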
Windows 7
numpy 1.11.3
Enthought Canopy Python 2.7.9
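For reference, a minimal sketch of the dtype workaround, assuming the cause is the default integer width that numpy picks on this platform (an assumption on my part). It passes `dtype=np.int64` to `sum()` so that the accumulator is 64-bit regardless of the array's own dtype, and prints the default dtype so it can be inspected.

```python
import numpy as np

n = 200000
expected = n * (n - 1) // 2  # exact sum of 0..n-1 using Python integers

# Show which integer dtype numpy chose by default on this platform.
default_dtype = np.arange(n).dtype
print('default dtype is {}'.format(default_dtype))

# Request a 64-bit accumulator explicitly instead of relying on the default.
total = np.arange(n).sum(dtype=np.int64)
print('total is {}'.format(total))
print('matches exact sum: {}'.format(int(total) == expected))
```

This still depends on remembering to pass dtype, so it does not answer whether the overflow can be caught automatically.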