If statement for pandas DataFrames in Python

I have a dataframe that looks like this:

    timestamp                      0            1            2            3
    2013-04-17 05:00:00     4.335212  2655.140854  2655.140854  2655.140854
    2013-04-17 05:10:00     2.224966  2655.140854  2655.140854  2655.140854
    2013-04-17 05:20:00     2.409150  2655.140854  2655.140854  2655.140854
    2013-04-17 05:30:00  2655.140854  2655.140854  2655.140854  2655.140854

I need to apply an if-statement condition to every value in the dataframe. I tried:

    dirt = dirt.astype(float)
    for ind, i in enumerate(dirt):
        if i < 0:
            dirt[ind] = i + 360
        if i > 360:
            dirt[ind] = i - 360

However, addition and subtraction do not occur on any of the values. Any ideas?

2 answers

You should use .iterrows() instead of enumerate(df). Iterating over a DataFrame with enumerate(df) yields only the column labels, not the values, so your conditions are never tested against the data. iterrows() yields the index and the row (as a pandas.Series) on each iteration.
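To illustrate the difference, here is a minimal sketch. The sample values and the names labels, rows, first_index and first_row are made up for the illustration; only the integer column labels mirror the frame in the question.

```python
import pandas as pd

# Hypothetical sample data; the integer column labels mirror the question's frame.
df = pd.DataFrame(
    {0: [4.335212, -10.0], 1: [2655.140854, 400.0]},
    index=pd.to_datetime(["2013-04-17 05:00:00", "2013-04-17 05:10:00"]),
)

# Iterating a DataFrame directly (with or without enumerate) yields column labels.
labels = [label for _, label in enumerate(df)]

# iterrows() yields (index, row) pairs, where each row is a pandas.Series.
rows = list(df.iterrows())
first_index, first_row = rows[0]
```

Here labels holds the column labels 0 and 1, which is why a test such as i < 0 or i > 360 in the original loop never fires on the actual data.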

But for your requirement, you can iterate over df.columns and do what you want in a vectorized way for each column. Example -

    for col in df.columns:
        df.loc[df[col] < 0, col] += 360
        df.loc[df[col] > 360, col] -= 360

I loop over columns rather than rows on the assumption that there are far fewer columns than rows, so the Python-level loop runs for far fewer iterations while each vectorized operation handles a whole column at once.

Demo -

    In [128]: df
    Out[128]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00     4.335212  2655.140854  2655.140854  2655.140854
    2013-04-17 05:10:00     2.224966  2655.140854  2655.140854  2655.140854
    2013-04-17 05:20:00     2.409150  2655.140854  2655.140854  2655.140854
    2013-04-17 05:30:00  2655.140854  2655.140854  2655.140854  2655.140854

    In [134]: for col in df.columns:
       .....:     df.loc[df[col] < 0, col] += 360
       .....:     df.loc[df[col] > 360, col] -= 360
       .....:

    In [135]: df
    Out[135]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00     4.335212  2295.140854  2295.140854  2295.140854
    2013-04-17 05:10:00     2.224966  2295.140854  2295.140854  2295.140854
    2013-04-17 05:20:00     2.409150  2295.140854  2295.140854  2295.140854
    2013-04-17 05:30:00  2295.140854  2295.140854  2295.140854  2295.140854
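The demo data contains no negative values, so only the "> 360" branch is exercised. As a self-contained sketch of the same column loop, the frame below uses made-up values (an assumption for illustration, not the asker's data) that trigger both branches.

```python
import pandas as pd

# Hypothetical values chosen so that both conditions fire at least once.
df = pd.DataFrame(
    {0: [-10.0, 2.5, 370.0], 1: [2655.140854, -350.0, 180.0]},
    index=pd.date_range("2013-04-17 05:00:00", periods=3, freq="10min"),
)

for col in df.columns:
    # Boolean mask selects the rows; the arithmetic is applied to the whole column at once.
    df.loc[df[col] < 0, col] += 360
    df.loc[df[col] > 360, col] -= 360
```

Note that, as in the answer, each condition applies a single shift of 360, so a value such as 2655.140854 ends up at 2295.140854 rather than inside the 0 to 360 range.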

You can use masking with where and update to modify the existing dataframe values as follows:

    In [188]: df
    Out[188]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00     4.335212  2655.140854  2655.140854  2655.140854
    2013-04-17 05:10:00     2.224966  2655.140854  2655.140854  2655.140854
    2013-04-17 05:20:00     2.409150  2655.140854  2655.140854  2655.140854
    2013-04-17 05:30:00  2655.140854  2655.140854  2655.140854  2655.140854

    In [189]: df_small = df.where(df < 0).apply(lambda x: x + 360)

    In [190]: df_small
    Out[190]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00          NaN          NaN          NaN          NaN
    2013-04-17 05:10:00          NaN          NaN          NaN          NaN
    2013-04-17 05:20:00          NaN          NaN          NaN          NaN
    2013-04-17 05:30:00          NaN          NaN          NaN          NaN

    In [191]: df_large = df.where(df > 360).apply(lambda x: x - 360)

    In [192]: df_large
    Out[192]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00          NaN  2295.140854  2295.140854  2295.140854
    2013-04-17 05:10:00          NaN  2295.140854  2295.140854  2295.140854
    2013-04-17 05:20:00          NaN  2295.140854  2295.140854  2295.140854
    2013-04-17 05:30:00  2295.140854  2295.140854  2295.140854  2295.140854

    In [193]: df.update(df_small)

    In [194]: df.update(df_large)

    In [195]: df
    Out[195]:
                                   0            1            2            3
    timestamp
    2013-04-17 05:00:00     4.335212  2295.140854  2295.140854  2295.140854
    2013-04-17 05:10:00     2.224966  2295.140854  2295.140854  2295.140854
    2013-04-17 05:20:00     2.409150  2295.140854  2295.140854  2295.140854
    2013-04-17 05:30:00  2295.140854  2295.140854  2295.140854  2295.140854

Note:

This can run into corner cases if your conditions overlap, for example "if value < 360 then +360, else -360": applying the updates in sequence re-applies a condition to an already adjusted result, i.e. 1 + 360 = 361, then 361 > 360, so it becomes 1 again.
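A minimal sketch of that corner case follows. The frame, the column name "a", and the variables seq and once are made up for the illustration, and computing both masks up front with DataFrame.mask is one possible workaround suggested here, not part of the answer above.

```python
import pandas as pd

# Hypothetical data: 1.0 triggers the "< 360" rule, 400.0 the "> 360" rule.
df = pd.DataFrame({"a": [1.0, 400.0]})

# Sequential updates: the second condition is evaluated on already shifted values.
seq = df.copy()
seq.update(seq.where(seq < 360) + 360)  # 1 -> 361
seq.update(seq.where(seq > 360) - 360)  # 361 -> 1 again, 400 -> 40

# Masks computed once from the original values: each cell is shifted at most once.
small = df < 360
large = df > 360
once = df.mask(small, df + 360).mask(large, df - 360)
```

With sequential updates the value 1 ends up unchanged, while evaluating both conditions against the original frame leaves it at 361.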

But for your use case, I think @AnandSKumar's method is very clean and closest to what you are looking for.


Source: https://habr.com/ru/post/1234117/

