Using a dictionary to replace column values at given index labels in a pandas DataFrame

Consider the following DataFrame:

    import numpy as np
    import pandas as pd

    df_test = pd.DataFrame({'a': [1, 2, 8], 'b': [np.nan, np.nan, 5], 'c': [np.nan, np.nan, 4]})
    df_test.index = ['one', 'two', 'three']

which gives

           a    b    c
    one    1  NaN  NaN
    two    2  NaN  NaN
    three  8    5    4

I have a row replacement dictionary for columns b and c. For instance:

    {'one': [3.1, 2.2], 'two': [8.8, 4.4]}

where 3.1 and 8.8 go into column b and 2.2 and 4.4 go into column c, so that the result is

           a    b    c
    one    1  3.1  2.2
    two    2  8.8  4.4
    three  8    5    4

I know how to make these changes with a for loop:

    index_list = ['one', 'two']
    value_list_b = [3.1, 8.8]
    value_list_c = [2.2, 4.4]
    for i in range(len(index_list)):
        # .loc is used here because the older .ix indexer was removed in pandas 1.0
        df_test.loc[df_test.index == index_list[i], 'b'] = value_list_b[i]
        df_test.loc[df_test.index == index_list[i], 'c'] = value_list_c[i]

but I'm sure there is a more convenient and faster way to use the dictionary!

I think this can be done using the DataFrame.replace method, but I could not figure it out.

Thanks for the help,

CD

1 answer

You are looking for DataFrame.update. The only twist in your case is that you specify the updates as a dictionary of rows, whereas a DataFrame is usually created from a dictionary of columns. The orient keyword can handle this.

    In [25]: df_test
    Out[25]:
           a    b    c
    one    1  NaN  NaN
    two    2  NaN  NaN
    three  8    5    4

    In [26]: row_replacements = {'one': [3.1, 2.2], 'two': [8.8, 4.4]}

    In [27]: df_update = DataFrame.from_dict(row_replacements, orient='index')

    In [28]: df_update.columns = ['b', 'c']

    In [29]: df_test.update(df_update)

    In [30]: df_test
    Out[30]:
           a    b    c
    one    1  3.1  2.2
    two    2  8.8  4.4
    three  8  5.0  4.0

from_dict is a specific DataFrame constructor that gives us the orient keyword, which is not available if you just call DataFrame(...). For reasons I don't know, we cannot pass the column names ['b', 'c'] to from_dict, so I set them in a separate step.
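
For readers on a newer pandas, here is a self-contained sketch of the same approach. It assumes pandas 0.23 or later, where from_dict accepts a columns argument together with orient='index', so the column names can be given in the constructor call. The variable names are taken from the question; nothing else is assumed about the original session.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(
    {'a': [1, 2, 8], 'b': [np.nan, np.nan, 5], 'c': [np.nan, np.nan, 4]},
    index=['one', 'two', 'three'],
)
row_replacements = {'one': [3.1, 2.2], 'two': [8.8, 4.4]}

# Assumption: pandas >= 0.23, where from_dict accepts `columns`
# when orient='index' (each dict key becomes a row label).
df_update = pd.DataFrame.from_dict(
    row_replacements, orient='index', columns=['b', 'c']
)

# update() aligns on index and column labels and modifies df_test in place
# (it returns None). Row 'three' and column 'a' are not present in
# df_update, so they are left unchanged.
df_test.update(df_update)
print(df_test)
```

Because update works in place, assigning its return value back to df_test would replace the frame with None.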


Source: https://habr.com/ru/post/949501/

