Python: pandas apply vs. map

I'm trying to figure out exactly how df.apply() works.

My problem is this: I have a dataframe df. I want to search several columns for a specific string. If the string is found in any of those columns, I want to add a "label" (in a new column) to each row where it was found.

I can solve the problem with map and applymap (see below).

However, I would expect apply to be the better solution, since it applies a function to an entire column.

Question: Is this not possible with apply? Where is my mistake?

Here are my solutions using map and applymap.

import re
import pandas as pd

df = pd.DataFrame([list("ABCDZ"), list("EAGHY"), list("IJKLA")], columns=["h1", "h2", "h3", "h4", "h5"])

Solution using map

def setlabel_func(column):
    return df[column].str.contains("A")

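# Summing the boolean Series counts the matching columns per row
mask = sum(map(setlabel_func, ["h1","h5"]))
# > 0 also catches rows where both columns match; .ix was removed from pandas, so use .loc
df.loc[mask > 0, "New Column"] = "Label"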

Solution using applymap

# applymap works element by element (renamed to DataFrame.map in pandas 2.1)
mask = df[["h1","h5"]].applymap(lambda el: bool(re.match("A", el))).any(axis=1)
df.loc[mask, "New Column"] = "Label"

Attempt using apply (this is the part I can't get to work ;-)

def setlabel_func(column):
    return df[column].str.contains("A")

df.apply(setlabel_func(["h1","h5"]),axis = 1)

This raises an error:

AttributeError: 'DataFrame' object has no attribute 'str'

Any advice? Note that I use .str.contains here on purpose.


Use DataFrame.any to check for at least one True per row:

print (df[['h1', 'h5']].apply(lambda x: x.str.contains('A')))
      h1     h5
0   True  False
1  False  False
2  False   True

print (df[['h1', 'h5']].apply(lambda x: x.str.contains('A')).any(axis=1))
0     True
1    False
2     True
dtype: bool

df['new'] = np.where(df[['h1','h5']].apply(lambda x: x.str.contains('A')).any(axis=1),
                     'Label', '')

print (df)
  h1 h2 h3 h4 h5    new
0  A  B  C  D  Z  Label
1  E  A  G  H  Y       
2  I  J  K  L  A  Label

Alternatively, build a boolean mask and assign with loc:

mask = df[['h1', 'h5']].apply(lambda x: x.str.contains('A')).any(axis=1)
df.loc[mask, 'New'] = 'Label'
print (df)
  h1 h2 h3 h4 h5    New
0  A  B  C  D  Z  Label
1  E  A  G  H  Y    NaN
2  I  J  K  L  A  Label
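
The approach above can be wrapped in a small helper. This is only a sketch: the name label_rows and its parameters are hypothetical, and it assumes the search pattern is a regular expression and that missing values should count as "no match" (na=False).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list("ABCDZ"), list("EAGHY"), list("IJKLA")],
                  columns=["h1", "h2", "h3", "h4", "h5"])

def label_rows(frame, columns, pattern, label="Label"):
    """Hypothetical helper: label rows where any of `columns` matches `pattern`."""
    # apply passes each selected column to the lambda as a Series,
    # so the .str accessor is available inside it
    hits = frame[columns].apply(
        lambda col: col.str.contains(pattern, regex=True, na=False)
    )
    # True for a row if at least one of the columns matched
    return np.where(hits.any(axis=1), label, "")

df["new"] = label_rows(df, ["h1", "h5"], r"^A$")
print(df)
```

Swapping the pattern for a more complex regex should not require any other change, since str.contains treats it as a regular expression by default.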

IIUC (if I understand correctly), you can do it this way:

In [23]: df['new'] = np.where(df[['h1','h5']].apply(lambda x: x.str.contains('A'))
                                             .sum(1) > 0,
                              'Label', '')

In [24]: df
Out[24]:
  h1 h2 h3 h4 h5    new
0  A  B  C  D  Z  Label
1  E  A  G  H  Y
2  I  J  K  L  A  Label
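
For comparison, the same mask can be built without apply by OR-ing the per-column results, which stays close to the map idea from the question. This is a minimal sketch that assumes the columns contain only strings; the variable names are illustrative.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame([list("ABCDZ"), list("EAGHY"), list("IJKLA")],
                  columns=["h1", "h2", "h3", "h4", "h5"])

columns = ["h1", "h5"]

# One boolean Series per column
masks = [df[c].str.contains("A") for c in columns]

# Element-wise OR across the Series: True if any column matched in that row
mask = reduce(operator.or_, masks)

df.loc[mask, "New Column"] = "Label"
print(df)
```

Unlike comparing a sum with 1, the OR stays True for rows where more than one column matches.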

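pd.DataFrame.apply iterates over the columns (or over the rows with axis=1) and passes each one to the function as a pd.Series. The function you are trying to apply expects column names rather than a Series, so it does not lend itself to apply.

Instead, pass a function that works on the Series it receives: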

mask = df[['h1', 'h5']].apply(lambda x: x.str.contains('A').any(), axis=1)
df.loc[mask, 'New Column'] = 'Label'

  h1 h2 h3 h4 h5 New Column
0  A  B  C  D  Z      Label
1  E  A  G  H  Y        NaN
2  I  J  K  L  A      Label
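
To make the mechanics concrete, here is a small sketch of what apply hands to the function along each axis, and why the call from the question fails before apply even runs. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([list("ABCDZ"), list("EAGHY"), list("IJKLA")],
                  columns=["h1", "h2", "h3", "h4", "h5"])
sub = df[["h1", "h5"]]

# axis=0 (default): the function is called once per column with a Series
col_types = sub.apply(lambda s: type(s).__name__)

# axis=1: the function is called once per row with a Series
# whose index holds the column names
row_labels = sub.apply(lambda s: ",".join(s.index), axis=1)

# In the question, setlabel_func(["h1", "h5"]) is evaluated first.
# Selecting with a list of names yields a DataFrame, which has no .str accessor.
try:
    df[["h1", "h5"]].str
except AttributeError as err:
    message = str(err)

print(col_types)
print(row_labels)
print(message)
```

In other words, apply needs the function itself (a callable), not the result of calling it.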


Source: https://habr.com/ru/post/1669452/

