Most efficient way to map a dict of dicts based on two pandas columns

I have the following problem: I would like to look up values in a dict of dicts based on two columns of a pandas DataFrame. The only solution I've come up with so far is to use apply. The problem is that my DataFrame has over a million rows, so apply can take a long time. Any ideas on how to do this more efficiently? Here is my code:

import pandas as pd
import numpy as np

dict_dict = {'A': {'a': 1, 'b': 2, 'c': 3},
             'B': {'a': 4, 'b': 5, 'c': 6},
             'C': {'a': 7, 'b': 8, 'c': 9},
             'D': {'a': 10, 'b': 11, 'c': 12}}

list1 = ['A', 'B', 'C']
list2 = ['a', 'b', 'c']

np.random.seed(100)

df = pd.DataFrame()
df['col1'] = np.random.choice(list1, 10)
df['col2'] = np.random.choice(list2, 10)

df['map'] = df.apply(lambda x: dict_dict[x.col1][x.col2], axis=1)

df

  col1 col2  map
0    A    c    3
1    A    c    3
2    A    b    2
3    C    a    7
4    C    a    7
5    A    a    1
6    C    a    7
7    B    c    6
8    C    a    7
9    C    b    8
1 answer

You can build a DataFrame from dict_dict and use merge:

# Construct a DataFrame from dict_dict
df2 = pd.DataFrame(dict_dict).stack().rename('map').to_frame()

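# Perform a merge. stack() puts the inner keys ('a', 'b', 'c') on the first
# index level and the outer keys ('A', 'B', ...) on the second, hence the
# order ['col2', 'col1']. If df already has a 'map' column from the apply
# step, drop it first; otherwise merge returns 'map_x' and 'map_y'.
df = df.merge(df2, how='left', left_on=['col2', 'col1'], right_index=True)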

Result:

  col1 col2  map
0    A    c    3
1    A    c    3
2    A    b    2
3    C    a    7
4    C    a    7
5    A    a    1
6    C    a    7
7    B    c    6
8    C    a    7
9    C    b    8
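
As a possible alternative to the merge above (a sketch, not part of the original answer), the nested dict can be flattened into a Series indexed by (outer, inner) pairs and looked up with reindex. The names lookup, keys, and expected are illustrative, and the random generator used here differs from the np.random.seed call in the question, so the rows will not match the table above.

```python
import numpy as np
import pandas as pd

dict_dict = {'A': {'a': 1, 'b': 2, 'c': 3},
             'B': {'a': 4, 'b': 5, 'c': 6},
             'C': {'a': 7, 'b': 8, 'c': 9},
             'D': {'a': 10, 'b': 11, 'c': 12}}

# Sample data in the same shape as the question (different random draws).
rng = np.random.default_rng(100)
df = pd.DataFrame({'col1': rng.choice(['A', 'B', 'C'], 10),
                   'col2': rng.choice(['a', 'b', 'c'], 10)})

# Flatten the nested dict into a Series whose index holds (outer, inner) pairs.
lookup = pd.Series({(outer, inner): value
                    for outer, inner_map in dict_dict.items()
                    for inner, value in inner_map.items()})

# Build one key per row from the two columns and look them all up at once.
keys = pd.MultiIndex.from_arrays([df['col1'], df['col2']])
df['map'] = lookup.reindex(keys).to_numpy()

# Cross-check against the row-wise apply from the question.
expected = df.apply(lambda x: dict_dict[x.col1][x.col2], axis=1)
assert (df['map'] == expected).all()
```

Pairs that are missing from dict_dict would come back as NaN rather than raising a KeyError, which also turns the column into floats. Whether this is faster than merge on a million rows is worth timing on your own data.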

Source: https://habr.com/ru/post/1677561/

