I'm seeing what is (IMHO) very strange behavior with some data loaded into pandas from a CSV file. To protect the innocent, let's just say the DataFrame is in the variable homes and has, among others, the following columns:
In [143]: homes[['zipcode', 'sqft', 'price']].dtypes
Out[143]:
zipcode    int64
sqft       int64
price      int64
dtype: object
To get the average price in each zip code, I tried:
In [146]: homes.groupby('zipcode')[['price']].mean().head(n=5)
Out[146]:
           price
zipcode
28001     280804
28002     234284
28003     294111
28004    1355927
28005     810164
Oddly enough, the mean price comes back as int64, as shown here:
In [147]: homes.groupby('zipcode')[['price']].mean().dtypes
Out[147]:
price    int64
dtype: object
I can't think of any technical reason why the mean of an integer column would not be promoted to float. What's more, just adding one more column to the selection makes price come out as float64, which is what I expected all along:
In [148]: homes.groupby('zipcode')[['price', 'sqft']].mean().dtypes
Out[148]:
price    float64
sqft     float64
dtype: object

                  price          sqft
zipcode
28001     280804.690608  14937.450276
28002     234284.035176   7517.633166
28003     294111.278571  10603.096429
28004    1355927.097792  13104.220820
28005     810164.880952  19928.785714
To make sure I was not missing something obvious, I created another very simple DataFrame (df), but the problem does not show up there:
In [161]: df[['J','K']].dtypes
Out[161]:
J    int64
K    int64
dtype: object

In [164]: df[['J','K']].head(n=10)
Out[164]:
   J   K
0  0  -9
1  0 -14
2  0   8
3  0 -11
4  0  -7
5 -1   7
6  0   2
7  0   0
8  0   5
9  0   3

In [165]: df.groupby('J')[['K']].mean()
Out[165]:
           K
J
-2 -2.333333
-1  0.466667
 0 -1.030303
 1 -1.750000
 2 -3.000000
Note that here, with a single int64 column K grouped by J (also int64), the mean comes out directly as a float. The homes DataFrame was read from a CSV file I was given; df was created in pandas, written to CSV and then read back.
Last but not least, I use pandas 0.16.2.
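
In case it helps anyone reproduce this, here is a minimal, self-contained sketch of the comparison. The data is made up (the name homes_demo and every value in it are placeholders, not my real file), chosen only so that the prices are large integers whose group means are not whole numbers. Whether the single-column result shows int64 or float64 may depend on the pandas version, so the sketch just prints the dtypes instead of assuming them. The last step casts price to float64 before grouping, which should give a float mean in any case.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV: large integer prices whose
# per-zipcode means are not whole numbers.
csv_text = """zipcode,sqft,price
28001,100,280804
28001,120,280805
28002,90,234284
28002,95,234285
"""

homes_demo = pd.read_csv(io.StringIO(csv_text))

# Mean of a single int64 column versus two int64 columns.
one_col = homes_demo.groupby('zipcode')[['price']].mean()
two_col = homes_demo.groupby('zipcode')[['price', 'sqft']].mean()
print(one_col.dtypes)
print(two_col.dtypes)

# Possible workaround: cast to float64 before grouping, so the result
# cannot be converted back to an integer dtype.
as_float = (
    homes_demo
    .assign(price=homes_demo['price'].astype('float64'))
    .groupby('zipcode')[['price']]
    .mean()
)
print(as_float.dtypes)
```

If one_col reports int64 while two_col and as_float report float64, that matches what I see with my real data.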