Let's say I have:
s = 'white male, 2 white females'
And I want to expand this to:
'white male, white female, white female'
A more complete list of cases:
- 'two Hispanic men, two Hispanic women'
- → 'Hispanic man, Hispanic man, Hispanic woman, Hispanic woman'
- '2 black males, white male'
- → 'black man, black man, white man'
It looks like I'm close with:
import re
mult = re.compile('two|2 (?P<race>[a-z]+) (?P<gender>(?:fe)?male)s')
s = 'white male, 2 white females'
mult.sub(r'\g<race> \g<gender>, \g<race> \g<gender>', s)
s = 'two hispanic males, 2 hispanic females'
mult.sub(r'\g<race> \g<gender>, \g<race> \g<gender>', s)
What causes the failure in the second case?
Bonus question: Is there a pandas Series method that implements this directly, instead of going through Series.apply()? Thanks for looking at my question and spending any time on it.
For example, on:
import pandas as pd
s = pd.Series(
    ['white male',
     'white male, white female',
     'hispanic male, 2 hispanic females',
     'black male, 2 white females'])
Is there a faster route than:
s.apply(lambda x: mult.sub(..., x))
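
On the second case, one possible explanation is that `|` has the lowest precedence in a regex, so the pattern reads as "`two`" OR "`2 <race> <gender>s`". When only the bare `two` branch matches, the named groups never take part in the match, and depending on the Python version the substitution either raises an "unmatched group" error or fills in empty strings. A minimal sketch, assuming the intent is "either `two` or `2`, followed by race and plural gender" (the `expand` helper is just a name made up for illustration):

```python
import re

# Assumption: 'two' and '2' are meant to be alternatives for the count only.
# Wrapping them in a non-capturing group keeps the alternation from
# swallowing the rest of the pattern.
mult = re.compile(r'(?:two|2) (?P<race>[a-z]+) (?P<gender>(?:fe)?male)s')
repl = r'\g<race> \g<gender>, \g<race> \g<gender>'

def expand(text):
    """Hypothetical helper: replace each 'two/2 <race> <gender>s' with two singular entries."""
    return mult.sub(repl, text)

print(expand('white male, 2 white females'))
print(expand('two hispanic males, 2 hispanic females'))
```

This only covers the lowercase `male`/`female` wording from my attempt, not the `men`/`women` cases in the list above.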