This question is related to my previous question. Given the following DataFrame:
df =
ID  TYPE  VD_0   VD_1   VD_2  VD_3   VD_4  VD_5
1   ABC   V1234  aaa    bbb   456    123   564
2   DBC   456    A45    aaa   V1234  bbb   564
3   ABD   456    V1234  bbb   ccc    456   123
4   ABD   ccc    aaa    123   V1234  SSW   123
The following is a list of target values for the columns VD_0 to VD_5:
myList = ['V1234', '456', 'A45']
I want to get only those rows in df that contain 2 or more "sequential" occurrences of values from myList in the columns VD_0 to VD_5, where any other values (ones that do not belong to myList) are allowed to appear between them. For example, values such as aaa, bbb, ccc, etc. are permitted in between.
The result should be as follows:
result =
ID  TYPE  Col_0  Col_1  Col_2
1   ABC   V1234  456
2   DBC   456    A45    V1234
3   ABD   456    V1234  456
In result I want to display only the values from myList in the Col_ columns, ignoring the rest of the values.
This is my current attempt, which only finds rows where values from myList appear in adjacent columns:
subset = df.filter(like='VD_')
df[subset.isin(myList).rolling(2, axis=1).sum().max(axis=1)>=2]
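The rolling check above only fires when two values from myList sit in directly adjacent columns, so it would miss the row with ID 1. A minimal sketch of one possible alternative follows. It assumes every cell is stored as a string, and that "sequential with other values allowed in between" reduces to "at least two cells of the row belong to myList", which is consistent with the expected result shown earlier. The names mask, keep, rows and cols are introduced here for illustration only.

```python
import pandas as pd

# Sample data from the question; all values are assumed to be strings.
df = pd.DataFrame(
    [
        [1, "ABC", "V1234", "aaa", "bbb", "456", "123", "564"],
        [2, "DBC", "456", "A45", "aaa", "V1234", "bbb", "564"],
        [3, "ABD", "456", "V1234", "bbb", "ccc", "456", "123"],
        [4, "ABD", "ccc", "aaa", "123", "V1234", "SSW", "123"],
    ],
    columns=["ID", "TYPE"] + [f"VD_{i}" for i in range(6)],
)
myList = ["V1234", "456", "A45"]

subset = df.filter(like="VD_")
mask = subset.isin(myList)

# Other values may sit between the matches, so adjacency is not required:
# counting the matches per row is enough under the assumption stated above.
keep = mask.sum(axis=1) >= 2

# For each kept row, collect the matching values in left-to-right order.
rows = [
    [value for value in values if value in myList]
    for values in subset[keep].to_numpy().tolist()
]

# Rows of unequal length are padded with missing values on the right.
cols = pd.DataFrame(rows, index=df.index[keep])
cols.columns = [f"Col_{i}" for i in range(cols.shape[1])]

result = df.loc[keep, ["ID", "TYPE"]].join(cols)
print(result)
```

The list comprehension is not vectorized, so on a very large frame a stack-based reshaping might be faster; the sketch favors readability over speed.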