I have a text made up of several sentences, some of which are questions. I am trying to write a regular expression that retrieves only the questions that contain a specific phrase, namely "NSF":
import re
s = "This is a string. Is this a question? This isn't a question about NSF. Is this one about NSF? This one is a question about NSF but is it longer?"
Ideally, re.findall will return:
['Is this one about NSF?','This one is a question about NSF but is it longer?']
but my best attempt:
re.findall('([\.\?].*?NSF.*\?)+?',s)
[". Is this a question? This isn't a question about NSF. Is this one about NSF? This one is a question about NSF but is it longer?"]
I suspect I need to do something with non-greedy matching, but I'm not sure where I went wrong.
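For reference, here is a minimal sketch of the behaviour I'm after. It assumes that sentences end only with '.', '?' or '!' and that none of those characters appear inside a sentence (no abbreviations or decimals). Instead of relying on non-greedy `.*`, which can still cross sentence boundaries because `.` matches terminators too, it uses a negated character class so a match cannot span more than one sentence. The names `pattern` and `questions` are just illustrative.

```python
import re

s = ("This is a string. Is this a question? This isn't a question about NSF. "
     "Is this one about NSF? This one is a question about NSF but is it longer?")

# Assumption: '.', '?' and '!' are the only sentence terminators.
# [^.?!]* matches any run of characters that stays inside one sentence,
# so the match must contain "NSF" and end at that same sentence's '?'.
pattern = re.compile(r"[^.?!]*NSF[^.?!]*\?")

# Matches after the first sentence start with the separating space, so strip it.
questions = [m.strip() for m in pattern.findall(s)]
print(questions)
```

On the sample string this should yield the two NSF questions and skip the NSF statement, since that sentence ends in '.' rather than '?'.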