The problem is that the dash in your string is a non-ASCII Unicode character. In a str (a byte string) it is stored as several bytes, three in UTF-8, so it behaves like several characters:
>>> print len('—')
3
But if you use unicode instead of str:
>>> print len(u'—')
1
So the following will print True:
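import re

def learn_re(s):
    # The lone "." before " C" matches any single character, including the dash
    pattern = re.compile("[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} . C")
    if pattern.match(s):
        return True
    return False

# Pass a unicode string so the dash counts as one character
print learn_re(u"01:01:01.123 — C")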
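Note that this applies to Python 2. In Python 3, str is Unicode by default, so a plain str literal already behaves like the unicode string here.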
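For Python 3, the same check can be sketched as follows. This is only an illustration under the assumption that the dash is U+2014 (EM DASH) and that the byte form is UTF-8; the names dash, bytes_pattern and the result variables are made up for the example.

```python
import re

# Assumption: the dash in the question is U+2014 (EM DASH).
dash = "\u2014"

def learn_re(s):
    # Same pattern as above; the lone "." before " C" matches any one character.
    pattern = re.compile(r"[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} . C")
    return pattern.match(s) is not None

# In Python 3, str holds Unicode code points, so the dash has length 1.
text_len = len(dash)

# Encoded as UTF-8, the same dash takes 3 bytes (what Python 2's str sees).
byte_len = len(dash.encode("utf-8"))

# A str pattern against a str subject: "." consumes the whole dash.
matches_text = learn_re("01:01:01.123 \u2014 C")

# A bytes pattern against UTF-8 bytes: "." consumes only one byte of the dash,
# so the match fails, which mirrors the Python 2 str behaviour.
bytes_pattern = re.compile(rb"[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} . C")
matches_bytes = bytes_pattern.match("01:01:01.123 \u2014 C".encode("utf-8")) is not None

print(text_len, byte_len, matches_text, matches_bytes)
```

The bytes case is included only to show why the original str version failed: the single "." cannot cover a multi-byte character once the text has been encoded.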