Different result of calculating the average value on two computers

Question


I have two computers with Python 2.7.2 ([MSC v.1500 32 bit (Intel)] on win32) and NumPy 1.6.1. But

numpy.mean(data)

returns

 1.13595094681 on my old computer

and

 1.13595104218 on my new computer

Where

 data = [ 0.20227873 -0.02738848 0.59413314 0.88547146 1.26513398 1.21090782 1.62445402 1.80423951 1.58545554 1.26801944 1.22551131 1.16882968 1.19972098 1.41940248 1.75620842 1.28139281 0.91190684 0.83705413 1.19861531 1.30767155]

In both cases

 s = 0
 for n in data[:20]:
     s += n
 print s / 20

gives

 1.1359509334

Can someone explain why this happens and how to avoid it?

Mads

+4

python-2.7 numpy

Mads m pedersen Nov 16 '12 at 17:05


2 answers

This is because you have float32 arrays (single precision). In single precision, operations are accurate to only about 6 to 7 significant decimal digits. That is why your results agree up to the 6th decimal place (once the last digit is rounded) but not beyond it. Past that point, different architectures, machines, or compilers can give different results. If you need the results to agree to more digits, use arrays with higher precision (e.g. float64).
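As a rough check of the precision figures above, NumPy can report the limits of each floating point type through `np.finfo`. This is only a sketch; the variable names are illustrative and not from the original post.

```python
import numpy as np

# Report the approximate decimal precision and machine epsilon of each type.
for dtype in (np.float32, np.float64):
    info = np.finfo(dtype)
    print(dtype.__name__, "decimal digits:", info.precision, "eps:", info.eps)

# Approximate number of decimal digits each type can be trusted to:
# 6 for float32 and 15 for float64.
precision32 = np.finfo(np.float32).precision
precision64 = np.finfo(np.float64).precision
```

Any digits printed beyond that precision depend on how the individual rounding errors happened to accumulate.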

0

tiago Nov 17 '12 at 4:40


bogatron · Accepted Answer · 2012-11-16T21:19:19+0000

To avoid differences between the two machines, explicitly create the arrays as 32-bit or 64-bit floating point. NumPy relies on several other libraries, which may be 32-bit or 64-bit builds. Note that rounding can also occur in your print statements:

 >>> import numpy as np
 >>> a = [0.20227873, -0.02738848, 0.59413314, 0.88547146, 1.26513398,
 ...      1.21090782, 1.62445402, 1.80423951, 1.58545554, 1.26801944,
 ...      1.22551131, 1.16882968, 1.19972098, 1.41940248, 1.75620842,
 ...      1.28139281, 0.91190684, 0.83705413, 1.19861531, 1.30767155]
 >>> x32 = np.array(a, np.float32)
 >>> x64 = np.array(a, np.float64)
 >>> x32.mean()
 1.135951042175293
 >>> x64.mean()
 1.1359509335
 >>> print x32.mean()
 1.13595104218
 >>> print x64.mean()
 1.1359509335

Another point to be aware of: if you have lower-level libraries (for example, ATLAS or LAPACK) that are multi-threaded, then for large arrays the results can differ because the order of operations may vary from run to run and floating point addition is sensitive to that order.
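The effect of operation order can be seen even without NumPy or threads. The sketch below is an illustration, not something from the original answer: it adds the same three doubles in two different groupings, then uses the standard library's `math.fsum`, which returns a correctly rounded sum and so does not depend on the order.

```python
import math

values = [0.1, 0.2, 0.3]

# Floating point addition is not associative: grouping changes the rounding.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])
print(left_to_right == right_to_left)  # False

# math.fsum tracks the lost low-order bits, so the order does not matter.
exact_forward = math.fsum(values)
exact_backward = math.fsum(reversed(values))
print(exact_forward == exact_backward)  # True
```

A multi-threaded reduction can regroup the additions differently on each run, which is the same effect on a larger scale.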

In addition, you are at the limit of precision for 32-bit floats. Summing the same values in a different order already changes the last digit:

 >>> x32.sum()
 22.719021
 >>> np.array(sorted(x32)).sum()
 22.719019
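If the data has to stay in float32, one possible way to reduce this sensitivity is to ask NumPy to accumulate in double precision through the `dtype` argument of `mean`. This is a sketch under that assumption, not part of the original answer; the variable names are illustrative.

```python
import numpy as np

data = [0.20227873, -0.02738848, 0.59413314, 0.88547146, 1.26513398,
        1.21090782, 1.62445402, 1.80423951, 1.58545554, 1.26801944,
        1.22551131, 1.16882968, 1.19972098, 1.41940248, 1.75620842,
        1.28139281, 0.91190684, 0.83705413, 1.19861531, 1.30767155]
x32 = np.array(data, dtype=np.float32)

# Default: the running sum is kept in float32, so the last digits
# depend on the order in which the values are added.
mean32 = x32.mean()

# Same float32 data, but the running sum is kept in float64.
mean64 = x32.mean(dtype=np.float64)

# Reordering the data should no longer change the result in this case.
mean64_sorted = np.sort(x32).mean(dtype=np.float64)

print(repr(mean32), repr(mean64), repr(mean64_sorted))
```

The inputs are still rounded to float32, so the result will not match the float64 mean of the original decimal values digit for digit, but it no longer depends on how the sum is ordered for data of this size.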
