How about building a dictionary of incomes keyed by age range and then mapping a random choice from it onto the second frame, i.e.
    import numpy as np
    import pandas as pd

    # Based on unutbu's data
    df1 = pd.DataFrame({'agerange': [2, 2, 4, 4, 3, 3, 4],
                        'gender': [1, 0, 0, 0, 0, 0, 0],
                        'income': [56700, 25600, 3000, 106000, 200, 43000, 10000000],
                        'index': [0, 1, 2, 3, 4, 5, 6]})
    df2 = pd.DataFrame({'agerange': [3, 2, 4, 4],
                        'gender': [0, 0, 0, 0],
                        'index': [0, 1, 2, 3]})

    # Map each age range to a tuple of the incomes observed in df1
    age_groups = df1.groupby('agerange')['income'].agg(lambda x: tuple(x)).to_dict()

    # For every row of df2, pick one income at random from its age range
    df2['income'] = df2['agerange'].map(lambda x: np.random.choice(age_groups[x]))
Output (one possible draw, since the choice is random):
       agerange  gender  index  income
    0         3       0      0   43000
    1         2       0      1   25600
    2         4       0      2  106000
    3         4       0      3  106000
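Because the draw is random, the numbers will differ from run to run. If you need repeatable results, here is a minimal sketch using a seeded NumPy generator instead of `np.random.choice`. The seed value `0` and the name `rng` are my own choices for illustration, and the frames are trimmed to the columns this step needs.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'agerange': [2, 2, 4, 4, 3, 3, 4],
                    'income': [56700, 25600, 3000, 106000, 200, 43000, 10000000]})
df2 = pd.DataFrame({'agerange': [3, 2, 4, 4]})

# Seeded generator; the seed 0 is arbitrary and only makes reruns repeatable
rng = np.random.default_rng(0)

# Same lookup as above: age range -> tuple of incomes seen in df1
age_groups = df1.groupby('agerange')['income'].agg(lambda x: tuple(x)).to_dict()

# Draw one income per row from the pool of that row's age range
df2['income'] = df2['agerange'].map(lambda x: rng.choice(age_groups[x]))

print(df2)
```

Whatever the seed, every value in `df2['income']` should be one of the incomes recorded in `df1` for the same age range.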
If you also need to group by gender, you can use `apply`; and if you want to fill in 0 for keys that are not found, you can add an `if`/`else`, i.e.
    df2 = pd.DataFrame({'agerange': [3, 2, 6, 4],
                        'gender': [0, 0, 0, 0],
                        'index': [0, 1, 2, 3]})
    df1 = pd.DataFrame({'agerange': [2, 2, 4, 4, 3, 3, 4],
                        'gender': [1, 0, 0, 0, 0, 0, 0],
                        'income': [56700, 25600, 3000, 106000, 200, 43000, 10000000],
                        'index': [0, 1, 2, 3, 4, 5, 6]})

    # Keys are now (agerange, gender) tuples
    age_groups = df1.groupby(['agerange', 'gender'])['income'].agg(lambda x: tuple(x)).to_dict()

    # Fall back to 0 when the (agerange, gender) pair does not occur in df1
    df2['income'] = df2.apply(
        lambda x: np.random.choice(age_groups[x['agerange'], x['gender']])
        if (x['agerange'], x['gender']) in age_groups else 0,
        axis=1)
Output (one possible draw; age range 6 does not occur in df1, so that row gets 0):
       agerange  gender  index  income
    0         3       0      0   43000
    1         2       0      1   25600
    2         6       0      2       0
    3         4       0      3  106000
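If the inline `if`/`else` gets hard to read, the same fallback can be sketched with `dict.get` inside a small named function. `sample_income`, `pools`, `rng` and the seed `0` are hypothetical names and values I made up for this example; the logic is otherwise the same as above.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'agerange': [2, 2, 4, 4, 3, 3, 4],
                    'gender': [1, 0, 0, 0, 0, 0, 0],
                    'income': [56700, 25600, 3000, 106000, 200, 43000, 10000000]})
df2 = pd.DataFrame({'agerange': [3, 2, 6, 4],
                    'gender': [0, 0, 0, 0]})

rng = np.random.default_rng(0)  # arbitrary seed, for repeatable reruns only

# (agerange, gender) -> tuple of incomes seen in df1
pools = df1.groupby(['agerange', 'gender'])['income'].agg(lambda x: tuple(x)).to_dict()

def sample_income(row, default=0):
    """Hypothetical helper: draw one income from the row's (agerange, gender) pool."""
    pool = pools.get((row['agerange'], row['gender']))
    # No matching group in df1: return the default instead of raising KeyError
    return default if pool is None else rng.choice(pool)

df2['income'] = df2.apply(sample_income, axis=1)

print(df2)
```

One thing to consider: a default of 0 can be mistaken for a real income later on. Calling `df2.apply(sample_income, axis=1, default=np.nan)` instead should keep the unmatched rows distinguishable, at the cost of turning the column into floats.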