I have a regex that matches all three-character words in a string:
\b[^\s]{3}\b
When I use it with a string:
And the tiger attacked you.
this is the result:
import re
regex = re.compile(r"\b[^\s]{3}\b")
regex.findall(string)
[u'And', u'the', u'you']
As you can see, this matches "you" as a three-character word, but I want the expression to treat "you." (including the ".") as a four-character word, so it should not match.
I have the same problem with ",", ";", ":" etc.
I am new to regex, but I guess this happens because these characters are treated as word boundaries.
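To check that guess, here is a small sketch (assuming Python 3; the variable names text, boundaries and matches are only illustrative) that lists every position where \b matches in the sample sentence. \b is a zero-width position between a \w character and a non-\w character, so there is one between "u" and ".".

```python
import re

text = "And the tiger attacked you."

# \b is zero-width: it matches between a \w character and a non-\w
# character, or at a string edge that is adjacent to a \w character.
boundaries = [m.start() for m in re.finditer(r"\b", text)]
print(boundaries)  # [0, 3, 4, 7, 8, 13, 14, 22, 23, 26]

# Index 26 lies between "u" and ".", so "you" has a boundary on both sides.
print(text[23:26], repr(text[26]))  # you '.'

matches = re.findall(r"\b[^\s]{3}\b", text)
print(matches)  # ['And', 'the', 'you']
```

So the punctuation itself is not the boundary; the boundary is the gap between the last letter and the punctuation, which is why "you" still counts as a complete three-character match.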
Is there any way to do this?
Thanks in advance,
EDIT
Thanks for the answers, @BrenBarn and @Kendall Frey. I managed to find the regex that I was looking for:
(?<!\w)[^\s]{3}(?=$|\s)
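A quick check of this pattern against the sample sentence, as a sketch assuming Python 3. The second pattern, built only from whitespace lookarounds, is not from the answers; it is a possible alternative added here for comparison, and the string "x.abc def" is a made-up test input.

```python
import re

text = "And the tiger attacked you."

# Pattern from the edit: not preceded by a word character,
# and followed by whitespace or the end of the string.
pattern = re.compile(r"(?<!\w)[^\s]{3}(?=$|\s)")
found = pattern.findall(text)
print(found)  # ['And', 'the']

# Assumed alternative: require whitespace (or a string edge) on both sides.
alt = re.compile(r"(?<!\S)\S{3}(?!\S)")
alt_found = alt.findall(text)
print(alt_found)  # ['And', 'the']

# The two differ when punctuation comes right before three characters.
tricky = "x.abc def"
print(pattern.findall(tricky))  # ['abc', 'def']
print(alt.findall(tricky))      # ['def']
```

Both versions skip "you." in the sample sentence. The whitespace-only variant is stricter about what may precede a match, which may or may not be what is wanted.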