Pandas Column Concatenation

Question

I am trying to combine several columns that mainly contain NaN, but here is an example of only 2:

 2013-06-18 21:46:33.422096-05:00    A  NaN
 2013-06-18 21:46:35.715770-05:00    A  NaN
 2013-06-18 21:46:42.669825-05:00  NaN    B
 2013-06-18 21:46:45.409733-05:00    A  NaN
 2013-06-18 21:46:47.130747-05:00  NaN    B
 2013-06-18 21:46:47.131314-05:00  NaN    B

This can go on for 3 or 4 or 10 columns. Each row always has exactly one value for which pd.notnull() is true, and the rest are NaN.

I want to combine them into 1 column as quickly as possible. How can I do this?

+4

python pandas

user1610719 Jun 20 '13 at 15:39

2 answers

You could do

 In [278]: df = pd.DataFrame([[1, np.nan], [2, np.nan], [np.nan, 3]])

 In [279]: df
 Out[279]:
      0    1
 0    1  NaN
 1    2  NaN
 2  NaN    3

 In [280]: df.sum(1)
 Out[280]:
 0    1
 1    2
 2    3
 dtype: float64

Since NaN is treated as 0 during summation, it does not affect the result.

A few caveats: you must be sure that exactly one of the columns in each row is non-NaN for this to work. It also only works with numeric data.
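The caveat about exactly one non-NaN value per row can be checked before summing. This is only a sketch: the guard and the names exactly_one and combined are my own additions, not part of the original session.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, np.nan], [2, np.nan], [np.nan, 3]])

# Count the non-NaN cells in each row and require exactly one everywhere.
# (This guard is an illustrative addition, not part of the answer above.)
exactly_one = df.notna().sum(axis=1).eq(1).all()
if not exactly_one:
    raise ValueError("some rows do not have exactly one non-NaN value")

# NaN is skipped by sum(), so each row collapses to its single value.
combined = df.sum(axis=1)
print(combined)
```

If a row held two values, sum() would silently add them together, which is why the check comes first.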

You can also use

 df.fillna(method='ffill', axis=1).iloc[:, -1]

The last column will now contain all of the valid observations, since the valid values have been forward-filled across each row. See the pandas documentation for fillna. This second method should be more flexible, but slower. I select every row and only the last column with iloc[:, -1] .
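Here is a small sketch of the forward-fill approach on string data shaped like the question's. The column names "x" and "y" are made up for illustration. It uses DataFrame.ffill, which does the same forward fill; as far as I know, recent pandas releases deprecate the method= argument of fillna, so this spelling may be the safer one on a current install.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; each row has exactly one non-NaN value.
df = pd.DataFrame({
    "x": ["A", "A", np.nan, "A", np.nan],
    "y": [np.nan, np.nan, "B", np.nan, "B"],
})

# Forward-fill left to right along each row, then keep the last column,
# which now holds the single valid value of every row.
combined = df.ffill(axis=1).iloc[:, -1]
print(combined.tolist())
```

Unlike the sum, this does not depend on the data being numeric.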

0

Tomugspurger Jun 20 '13 at 16:01

Boud · Accepted Answer · 2013-06-20T17:19:21+0000

You have one value per row and the remaining cells are NaN, so the operation to apply is simply to ask for the max value:

  df.max(axis=1)

As noted in the comments, if this doesn't work in Python 3, convert the NaN values to strings first:

 df.fillna('').max(axis=1)
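A minimal sketch of why the empty-string fill helps, again with made-up column names and with dtype=object assumed so the columns hold plain Python strings. Once every cell is a string, the row-wise max only compares strings, and '' sorts before any non-empty string, so the real value wins.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; dtype=object is an assumption made here so the
# cells are plain Python strings (plus NaN) regardless of pandas version.
df = pd.DataFrame({
    "x": ["A", "A", np.nan],
    "y": [np.nan, np.nan, "B"],
}, dtype=object)

# Replace NaN with '' so max() compares strings with strings only.
combined = df.fillna('').max(axis=1)
print(combined.tolist())
```

Note that a row consisting entirely of NaN would come out as '' rather than NaN with this approach.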
