Iterating over a pandas DataFrame using a list comprehension

Question


I can solve this another way, but I would like to understand why iterating over a pandas DataFrame with a list comprehension does not work. (Here a is the DataFrame.)

    def func(a,seed1,seed2):
        for i in range(0,3):
            # Sum of squares. Results in a series containing 'date' and 'num'
            sorted1 = ((a-seed1)**2).sum(1)
            sorted2 = ((a-seed2)**2).sum(1)
            # This makes a list out of the dataframe.
            a = [a.ix[i] for i in a.index if sorted1[i]<sorted2[i]]
            b = [a.ix[i] for i in a.index if sorted1[i]>=sorted2[i]]
            # The above line throws the exception:
            # TypeError: 'builtin_function_or_method' object is not iterable
            # Throw it back into a dataframe...
            a = pd.DataFrame(a,columns=['A','B','C'])
            b = pd.DataFrame(b,columns=['A','B','C'])
            # Update the seed.
            seed1 = a.mean()
            seed2 = b.mean()
            print a.head()
            print "I'm computing."


python pandas dataframe

Michele reilly Aug 20 '13 at 16:24


1 answer

Andy hayden · Answer 1 · 2013-08-20T16:54:07+0000

The problem is that once the first of these two lines has run, a is no longer a DataFrame:

    a = [a.ix[i] for i in a.index if sorted1[i]<sorted2[i]]
    b = [a.ix[i] for i in a.index if sorted1[i]>=sorted2[i]]

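It is now a list, so a.index no longer refers to the DataFrame's index. On a list, a.index is the built-in list.index method, which is not iterable (hence the TypeError on the second line).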
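The failure can be reproduced without pandas at all. The following is a minimal sketch; the contents of the list are a hypothetical stand-in for the rows collected by the first comprehension, not data from the question.

```python
# Hypothetical stand-in for the rows collected by the first comprehension.
a = [("row0", 1.0), ("row1", 2.0)]

# On a list, `index` is the built-in list.index method, not a pandas Index.
kind = type(a.index).__name__
print(kind)

try:
    for i in a.index:  # same pattern as the second comprehension
        pass
except TypeError as exc:
    message = str(exc)
    print(message)
```

The printed message should match the exception quoted in the question, since the loop tries to iterate over a method object rather than over row labels.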

One Python trick is to build both lists in a single statement, so that both comprehensions are evaluated before a is rebound:

    a, b = [a.ix[i] for ...], [a.ix[i] for ...]

Perhaps the better option is to use a different variable name for the DataFrame (e.g. df), so that it is never overwritten by the list.
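As a rough illustration of why the single-statement form works: Python evaluates the whole right-hand side before assigning to either name. The sketch below uses made-up data and seeds, and it uses .loc in place of .ix because .ix was removed from later pandas releases.

```python
import pandas as pd

# Made-up data and seeds, purely for illustration.
df = pd.DataFrame({"A": [0.0, 1.0, 9.0], "B": [0.0, 1.0, 9.0], "C": [0.0, 1.0, 9.0]})
seed1 = pd.Series({"A": 0.0, "B": 0.0, "C": 0.0})
seed2 = pd.Series({"A": 10.0, "B": 10.0, "C": 10.0})

# Squared distance of every row to each seed.
sorted1 = ((df - seed1) ** 2).sum(axis=1)
sorted2 = ((df - seed2) ** 2).sum(axis=1)

a = df  # same name as in the question
# Both comprehensions still see the DataFrame: the tuple on the right is
# built completely before `a` and `b` are rebound.
a, b = (
    [a.loc[i] for i in a.index if sorted1.loc[i] < sorted2.loc[i]],
    [a.loc[i] for i in a.index if sorted1.loc[i] >= sorted2.loc[i]],
)

print(len(a), len(b))
```

After this statement a and b are plain lists of rows, so they would still need to be wrapped in pd.DataFrame before the next pass of the loop.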

As you say, there are better ways to do this in pandas. The obvious one is to use a boolean mask:

    msk = sorted1 < sorted2
    seed1 = df[msk].mean()
    seed2 = df[~msk].mean()
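Put together, the mask version of the loop from the question might look like the sketch below. The function name update_seeds, the n_iter parameter, and the sample data are assumptions made for this example. They do not come from the original post.

```python
import pandas as pd


def update_seeds(df, seed1, seed2, n_iter=3):
    """Repeatedly split rows by nearest seed and move each seed to its group mean."""
    for _ in range(n_iter):
        # Squared distance of every row to each seed.
        sorted1 = ((df - seed1) ** 2).sum(axis=1)
        sorted2 = ((df - seed2) ** 2).sum(axis=1)
        # True where a row is strictly closer to seed1.
        msk = sorted1 < sorted2
        seed1 = df[msk].mean()
        seed2 = df[~msk].mean()
    return seed1, seed2


# Made-up data: two rows near 0 and two rows near 10.
df = pd.DataFrame({
    "A": [0.0, 2.0, 8.0, 10.0],
    "B": [0.0, 2.0, 8.0, 10.0],
    "C": [0.0, 2.0, 8.0, 10.0],
})
s1, s2 = update_seeds(df, df.iloc[0], df.iloc[3])
print(s1["A"], s2["A"])
```

Because df is never rebound, no rows have to be copied into lists and back. One caveat of this sketch: if a group ends up empty, its mean is NaN, which a real implementation would probably want to handle.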
