Pandas: fill NaNs subject to a condition

I am struggling with something that seemed trivial, but apparently it is not. The general picture: data, a pandas DataFrame, contains (among others) the columns TOTAL_VISITS and NUM_PRINTS.

Goal: given a parameter num_prints, find the rows where NUM_PRINTS == num_prints and fill the NaNs in TOTAL_VISITS with a given number.

Where I ended up, and what didn't work:

indices= data['NUM_PRINTS'] == num_prints

data.loc[indices,'TOTAL_VISITS'].fillna(5,inplace=True)

From everything I know and have read, this should work. In practice it did not fill the NaNs with anything; it looks as if it operated on a copy or something, since nothing changed in the original object.

What works:

data.loc[indices,'TOTAL_VISITS'] = 2

This fills the column with 2 wherever the num_prints condition holds, but it overwrites every matching row instead of only the NaNs.

data['TOTAL_VISITS'].fillna(0, inplace=True)

This fills the NaNs in TOTAL_VISITS with 0, but does not take the NUM_PRINTS condition into account.

, for .iloc, , .


Assign the result of fillna back to the filtered rows:

import numpy as np
import pandas as pd

np.random.seed(1213)

c = ['TOTAL_VISITS', 'A', 'NUM_PRINTS']
data = pd.DataFrame(np.random.choice([1, np.nan, 3, 4], size=(10, 3)), columns=c)
print(data)
   TOTAL_VISITS    A  NUM_PRINTS
0           1.0  4.0         4.0
1           NaN  3.0         1.0
2           1.0  1.0         1.0
3           4.0  3.0         3.0
4           1.0  3.0         4.0
5           4.0  4.0         3.0
6           4.0  1.0         4.0
7           NaN  4.0         3.0
8           NaN  NaN         3.0
9           3.0  NaN         1.0


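num_prints = 1
# boolean mask of the rows that satisfy the condition
indices = data['NUM_PRINTS'] == num_prints
# fill NaNs in the selected rows and assign the result back
data.loc[indices, 'TOTAL_VISITS'] = data.loc[indices, 'TOTAL_VISITS'].fillna(100)
# alternative: fill the whole column; assignment aligns on the index,
# so only the selected rows are written
#data.loc[indices, 'TOTAL_VISITS'] = data['TOTAL_VISITS'].fillna(100)
print(data)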
   TOTAL_VISITS    A  NUM_PRINTS
0           1.0  4.0         4.0
1         100.0  3.0         1.0
2           1.0  1.0         1.0
3           4.0  3.0         3.0
4           1.0  3.0         4.0
5           4.0  4.0         3.0
6           4.0  1.0         4.0
7           NaN  4.0         3.0
8           NaN  NaN         3.0
9           3.0  NaN         1.0
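
To illustrate why the attempt in the question changed nothing, here is a minimal sketch on a small hand-made frame (the data values and the name subset are made up for the illustration, they are not from the question). Selecting with a boolean mask through .loc gives back a new Series, so an in-place fillna only touches that temporary object:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values are assumptions, not the question's data)
data = pd.DataFrame({
    'TOTAL_VISITS': [1.0, np.nan, 1.0, np.nan],
    'NUM_PRINTS':   [4.0, 1.0, 1.0, 3.0],
})
indices = data['NUM_PRINTS'] == 1

# Boolean-mask selection returns a copy of the matching rows,
# so the in-place fill is applied to the copy only.
subset = data.loc[indices, 'TOTAL_VISITS']
subset.fillna(5, inplace=True)

print(subset.tolist())                     # the copy has been filled
print(data['TOTAL_VISITS'].isna().sum())   # the original still has its NaNs
```

That is why the result has to be assigned back with data.loc[indices, 'TOTAL_VISITS'] = ..., as done above.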

fillna - , . .loc .

@jezrael.

num_prints = 1

mask = (data['NUM_PRINTS'] == num_prints) & data['TOTAL_VISITS'].isnull()

data.loc[mask, 'TOTAL_VISITS'] = 100
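
If this is needed in several places, the same mask approach could be wrapped in a small helper. This is only a sketch: the function name fill_nan_where and its parameters are hypothetical, and the sample frame is made up for the example.

```python
import numpy as np
import pandas as pd

def fill_nan_where(df, cond_col, cond_value, target_col, fill_value):
    """Fill NaNs in target_col on rows where cond_col == cond_value.

    Hypothetical helper; modifies df in place and returns it.
    """
    # rows that match the condition AND are missing in the target column
    mask = (df[cond_col] == cond_value) & df[target_col].isna()
    df.loc[mask, target_col] = fill_value
    return df

# Illustrative data (assumed values)
data = pd.DataFrame({
    'TOTAL_VISITS': [1.0, np.nan, 1.0, np.nan, 3.0],
    'NUM_PRINTS':   [4.0, 1.0, 1.0, 3.0, 1.0],
})

fill_nan_where(data, 'NUM_PRINTS', 1, 'TOTAL_VISITS', 100)
print(data)
```

Only the NaN in the row with NUM_PRINTS == 1 is replaced; the NaN in the row with NUM_PRINTS == 3 stays as it is.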

Source: https://habr.com/ru/post/1695307/

