I recently installed Anaconda 4.3, and when I compute a sum with df["x"].sum() or np.sum(df["x"]) on a pandas DataFrame read from Excel, I get the wrong answer. I also get the wrong answer from np.sum(x) or x.sum() when x is a randomly generated pandas Series, but np.sum(x) gives the correct answer when x is a random NumPy array. This only started after installing the new version of Anaconda, and only happens with datasets of more than 50,000 rows. My current setup is Anaconda 4.3, pandas 0.19.2, NumPy 1.11.3.
Incorrect sum with pandas:
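import numpy as np
import pandas as pd
# 100,000 random int64 values between 35,000 and 79,999
a = np.random.randint(35000, high=80000, size=100000, dtype=np.int64)
b = np.random.randint(1, high=100, size=100000)
c = np.sum(a)  # sum of the NumPy array: correct
d = pd.DataFrame({"a": a, "b": b})
e = d["a"].sum()  # sum of the pandas column: wrong
f = np.sum(d["a"])  # np.sum on the pandas column: wrong
print(c, e, f)
# Output: 5752269581 1457302285 1457302285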
Incorrect sum with my real data:
import numpy as np
import pandas as pd
w = pd.read_excel("C:\\Users\\asistentecrm\\Downloads\\Prueba2.xlsx")
x = w["Vlr. Neto"].sum()  # sum through pandas: wrong
y = w["Vlr. Neto"].values  # underlying NumPy array
z = np.sum(y)  # sum of the NumPy array: correct
print(x, z)
# Output: -423117005 3871850291
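One thing I noticed, in case it helps narrow this down: in both runs the wrong value differs from the correct one by exactly 2**32, which looks like the sum wrapping around in a signed 32-bit integer. That is only my guess at the cause, not something I have confirmed. The short check below uses plain Python arithmetic on the numbers printed above; wrap_int32 is a hypothetical helper written just for this illustration, not a pandas or NumPy function.

```python
def wrap_int32(value):
    """Reduce an integer to the signed 32-bit range, as an overflowing int32 would."""
    return (value + 2**31) % 2**32 - 2**31

# Correct sums (from NumPy) and wrong sums (from pandas) printed in the two runs above
correct_random, wrong_random = 5752269581, 1457302285
correct_excel, wrong_excel = 3871850291, -423117005

# Both differences are exactly 2**32
print(correct_random - wrong_random == 2**32)
print(correct_excel - wrong_excel == 2**32)

# Wrapping the correct sums to 32 bits reproduces the wrong ones
print(wrap_int32(correct_random) == wrong_random)
print(wrap_int32(correct_excel) == wrong_excel)
```

All four comparisons hold for the numbers above, so the pandas results are consistent with a 32-bit overflow somewhere in the sum, even though the column itself is int64.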
Sorry for my English, and thanks for the help.