You can use replace + groupby + GroupBy.max + replace + reset_index:
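# Encode Y > N > NaN as 1 > 0 > -1, take the max per id, then decode.
# The parentheses are needed so the chain can span several lines.
df1 = (df.replace({'Y': 1, 'N': 0, np.nan: -1})
         .groupby('id')
         .max()
         .replace({1: 'Y', 0: 'N', -1: np.nan})
         .reset_index())
print(df1)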
id var1 var2 var3
0 1 Y N Y
1 2 Y Y Y
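The idea above is that Y, N and NaN get numeric codes whose order matches the priority you want, so max per group picks the "best" value. Below is a minimal self-contained sketch of that idea. The sample frame is made up for illustration, and it swaps replace for Series.map so the encoded columns are numeric regardless of pandas version; missing values are simply left as NaN, which max skips unless the whole group is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the frame from the question.
df = pd.DataFrame({
    'id':   [1, 1, 2, 2],
    'var1': ['Y', 'N', 'N', 'Y'],
    'var2': ['N', 'N', 'Y', np.nan],
    'var3': [np.nan, np.nan, np.nan, 'N'],
})

to_num = {'Y': 1, 'N': 0}   # Y outranks N; NaN stays NaN
to_str = {1: 'Y', 0: 'N'}   # reverse mapping for decoding

value_cols = df.columns.drop('id')

# Encode every value column, aggregate per id, then decode again.
encoded = df[value_cols].apply(lambda s: s.map(to_num))
result = (encoded.groupby(df['id'])
                 .max()
                 .apply(lambda s: s.map(to_str))
                 .reset_index())
print(result)
```

A group ends up as NaN only when all of its values in that column are missing, which is the same result the replace version gives with -1 as the code for NaN.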
EDIT:
df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'var1': ['Y', 'Y', 'Y', 'N', 'Y', np.nan],
    'var2': ['N', 'N', 'N', 'Y', 'N', np.nan],
    'var3': [np.nan, np.nan, np.nan, 'N', np.nan, 'Y']
})
print(df)
id var1 var2 var3
0 1 Y N NaN
1 1 Y N NaN
2 2 Y N NaN
3 2 N Y N
4 3 Y N NaN
5 3 NaN NaN Y
You can also check the unique values first and build the reverse dictionary dynamically:
print (df.set_index('id').stack(dropna=False).unique())
['Y' 'N' nan]
d = {'Y':1,'N':0, np.nan:-1}
d1 = {v: k for k, v in d.items()}
df1 = df.replace(d).groupby('id').max().replace(d1).reset_index()
print (df1)
id var1 var2 var3
0 1 Y N NaN
1 2 Y Y N
2 3 Y N Y
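One thing to watch with the dict comprehension used for d1: it only gives a usable reverse mapping if every code in d is unique, otherwise later keys silently overwrite earlier ones. A small sketch of a guard for that, where invert_mapping is a hypothetical helper name and not part of pandas:

```python
import numpy as np

def invert_mapping(mapping):
    """Return {value: key}; raise if two keys share the same value."""
    inverse = {v: k for k, v in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError('mapping values must be unique to be inverted')
    return inverse

d = {'Y': 1, 'N': 0, np.nan: -1}
d1 = invert_mapping(d)   # same content as {v: k for k, v in d.items()}
print(d1)
```

With this, a typo such as giving 'Y' and 'N' the same code fails loudly instead of decoding to the wrong letter.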
EDIT1:
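Solution if the columns var1 to varN contain only Y, N and NaN. The placeholder 'A' sorts before 'N' and 'Y', so max returns it only when every value in the group is missing: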
varNAN = 'A'
print (df.fillna(varNAN).groupby('id').max().replace({varNAN:np.nan}).reset_index())
id var1 var2 var3
0 1 Y N NaN
1 2 Y Y N
2 3 Y N Y
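This last variant depends on the placeholder comparing lower than every real value as a string. Here is a minimal sketch that makes that assumption explicit with a check before aggregating. It reuses the sample frame from the EDIT above; the names placeholder, value_cols and observed are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id':   [1, 1, 2, 2, 3, 3],
    'var1': ['Y', 'Y', 'Y', 'N', 'Y', np.nan],
    'var2': ['N', 'N', 'N', 'Y', 'N', np.nan],
    'var3': [np.nan, np.nan, np.nan, 'N', np.nan, 'Y'],
})

placeholder = 'A'
value_cols = df.columns.drop('id')

# Collect the non-missing values and make sure the placeholder
# sorts below all of them, otherwise max could pick it by mistake.
observed = set(pd.Series(df[value_cols].to_numpy().ravel()).dropna())
assert all(placeholder < value for value in observed)

result = (df.fillna(placeholder)
            .groupby('id')
            .max()
            .replace({placeholder: np.nan})
            .reset_index())
print(result)
```

If the columns could ever hold a value that sorts before the placeholder, pick a different placeholder or use the numeric encoding from the first solution.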