I am trying to parse a string containing a name and a degree. I have a long list of them. Some do not contain degrees, some contain one, and some contain several.
Examples of lines:
Sam da Man JD
Green Eggs Jr. Ed.M.
Argle Bargle Sr. MA
Cersei Lannister MA Ph.D.
As far as I can tell, degrees appear in the following patterns:
xxxxxxxxx.
x.xx.
xx.xxxxx.
two caps (ex: 'MA')
How would I separate the name from the degrees?
I am new to regex and this problem has proven to be very time consuming. Following this post, I tried split = re.split('\s+|([.])', s) and split = re.split('\s+|\.', s), but both still split the string at the first space.
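To make the failure concrete, here is a small sketch that runs both patterns on one of the example lines. The patterns are the same as above, only written as raw strings; the variable names are just for illustration.

```python
import re

s = "Green Eggs Jr. Ed.M."

# '\s+' matches every run of whitespace, so the string is cut at each space;
# the second alternative adds further cuts at every period.
plain = re.split(r'\s+|\.', s)

# With a capture group, re.split also returns what the group matched.
# When the whitespace branch matches, the group did not take part, so None is returned.
captured = re.split(r'\s+|([.])', s)

print(plain)     # ['Green', 'Eggs', 'Jr', '', 'Ed', 'M', '']
print(captured)  # ['Green', None, 'Eggs', None, 'Jr', '.', '', None, 'Ed', '.', 'M', '.', '']
```

Neither pattern knows where the name ends, so both break the name and the degrees into fragments instead of separating them.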
In response to the first comment, I thought about classifying the degree instead. I am trying to write a regular expression that matches "xx" followed by a wildcard, because several degree patterns look like this: xx (something): xxxxxxxxx.
After that I would add a few more classifications.
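One way the pattern-based idea might look is sketched below. The two token shapes are guesses from the examples, not a complete rule, and the names DEGREE, LINE and parse_by_pattern are made up for this sketch. It assumes degrees always come at the end of the line, and it would misread an all-caps trailing word (for example a suffix like 'III') as a degree.

```python
import re

# Assumed token shapes (inferred from the examples only):
#   two or more capitals, optionally followed by letters and a period: MA, MBA, BSEd.
#   two or more dotted segments: Ph.D., Ed.M., M.Div.
# 'Jr.' and 'Sr.' match neither shape, so they stay with the name.
DEGREE = r'(?:[A-Z]{2,}[A-Za-z]*\.?|(?:[A-Z][A-Za-z]*\.){2,})'

# Lazy name, then zero or more whitespace-separated degree tokens up to the end.
LINE = re.compile(rf'^(?P<name>.*?)(?P<degrees>(?:\s+{DEGREE})*)\s*$')

def parse_by_pattern(line):
    """Return (name, [degrees]) using the assumed degree shapes above."""
    m = LINE.match(line)
    return m.group('name'), m.group('degrees').split()

for example in ["Sam da Man JD", "Green Eggs Jr. Ed.M.",
                "Argle Bargle Sr. MA", "Cersei Lannister MA Ph.D."]:
    print(parse_by_pattern(example))
```

Anchoring at the end of the line is what keeps capitalized words inside the name from being treated as degrees.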
Alternatively, classifying a name might be easier?
Or even listing the degrees in a collection and searching for them?
{'MAT', 'Ph.D.', 'MA', 'JD', 'Ed.M.', 'MBA', 'Ed.S.', 'M.Div.', 'M.Ed.', 'RN', 'BSEd.'}
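The collection idea can be sketched without any regex at all. This is a minimal version that assumes degrees are whitespace-separated and only appear at the end of a line; split_name_and_degrees is a hypothetical helper name, and the set would need to be extended as new degrees turn up in the data.

```python
DEGREES = {'MAT', 'Ph.D.', 'MA', 'JD', 'Ed.M.', 'MBA',
           'Ed.S.', 'M.Div.', 'M.Ed.', 'RN', 'BSEd.'}

def split_name_and_degrees(line):
    """Peel known degrees off the end of the line; whatever is left is the name."""
    tokens = line.split()
    degrees = []
    # Walk backwards from the last token while it is a known degree.
    while tokens and tokens[-1] in DEGREES:
        degrees.insert(0, tokens.pop())
    return ' '.join(tokens), degrees

print(split_name_and_degrees("Cersei Lannister MA Ph.D."))  # ('Cersei Lannister', ['MA', 'Ph.D.'])
print(split_name_and_degrees("Green Eggs Jr. Ed.M."))       # ('Green Eggs Jr.', ['Ed.M.'])
```

Compared with guessing at letter patterns, an explicit set never mistakes part of a name for a degree, at the cost of missing any degree that is not listed.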