How to apply a function to multiple columns of a pandas DataFrame at once

I often deal with data that is poorly formatted (i.e. the number of fields is not consistent, etc.).

There may be other ways that I don't know about, but the way I format a single column in a DataFrame is to write a function and map the column to that function.

format = df.column_name.map(format_number) 

Question 1: What if I have a DataFrame with 50 columns and want to apply this formatting to several of them, e.g. columns 1, 3, 5, 7, 9?

Can you do something like:

 format = df.1,3,5,9.map(format_number) 

...so that I could format all of those columns in one line?

+3
python pandas slice filtering
Feb 28 '14 at 4:59
2 answers

You can do df[['Col1', 'Col2', 'Col3']].applymap(format_number) . Note that this returns a new DataFrame; it does not modify the existing one. If you want to write the values back to the original, you need to do df[['Col1', 'Col2', 'Col3']] = df[['Col1', 'Col2', 'Col3']].applymap(format_number) .
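A minimal sketch of this assign-back pattern follows. The format_number body and the sample data are made up for illustration. Also note that DataFrame.applymap was renamed to DataFrame.map in pandas 2.1, so the sketch uses whichever of the two the installed version provides.

```python
import pandas as pd

def format_number(value):
    # Hypothetical formatter for illustration only: round to 2 decimals.
    return round(float(value), 2)

# Made-up sample data; "Other" is a column we do not want to touch.
df = pd.DataFrame({
    "Col1": [1.234, 5.678],
    "Col2": [10.111, 20.999],
    "Col3": [0.5, 3.14159],
    "Other": ["a", "b"],
})

cols = ["Col1", "Col2", "Col3"]

# Pick the element-wise method available in this pandas version:
# DataFrame.map on pandas >= 2.1, DataFrame.applymap on older releases.
elementwise = getattr(df[cols], "map", None) or df[cols].applymap

# The call returns a new DataFrame, so assign it back to the same columns.
df[cols] = elementwise(format_number)
print(df)
```

To pick columns by position instead of by name (for example columns 1, 3, 5), something like cols = df.columns[[1, 3, 5]] should work the same way.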

+7
Feb 28 '14 at 6:06

You can use apply as follows:

 df.apply(lambda row: format_number(row), axis=1) 

You need to specify the columns in your function format_number, though:

 def format_number(row):
     # Transform only the columns of interest, then hand the row back to apply
     row['Col1'] = doSomething(row['Col1'])
     row['Col2'] = doSomething(row['Col2'])
     row['Col3'] = doSomething(row['Col3'])
     return row

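This is not as elegant as @BrenBarn's answer. It can look as if the DataFrame is changed in place, but pandas does not support mutating the row passed to apply, so the effect on the original is not reliable. It is safer to have format_number return the row and assign the result of apply back to df.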
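Here is a hedged sketch of the row-wise approach, written so that format_number returns the row and the result of apply is assigned back rather than relying on in-place mutation. The do_something function is a hypothetical stand-in for doSomething, and the data is made up.

```python
import pandas as pd

def do_something(value):
    # Hypothetical stand-in for doSomething: double the value.
    return value * 2

def format_number(row):
    # Work on a copy so the row handed in by apply is not mutated.
    row = row.copy()
    for col in ("Col1", "Col2", "Col3"):
        row[col] = do_something(row[col])
    return row

df = pd.DataFrame({
    "Col1": [1, 2],
    "Col2": [3, 4],
    "Col3": [5, 6],
    "Other": [7, 8],
})

# apply with axis=1 calls format_number once per row and builds a new
# DataFrame from the returned rows; assign it back to keep the changes.
df = df.apply(format_number, axis=1)
print(df)
```

Row-wise apply calls the Python function once per row, so on large frames the column-wise form in the other answer is likely to be faster.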

0
Feb 28 '14 at 8:24