Python regex to match only first instance

I have Python code that reads a certificate file, and I want to extract only the root certificate. For example, my certificate file looks like this:

--------begin certificate--------
CZImiZPyLGQBGRYFbG9jYWwxGjAYBgoJkiaJk/IasdasdassZAEZFgp2aXJ0dWFsdnB4MSEw
HwYDVQQDExh2aXJ0dWFsdnB4LVZJUlRVQUxEQzEtQ0EwHhfdgdgdgfcNMTUwOTE2MTg1MTMx
WhcNMTcwOTE2MTkwMTMxWjBaMQswCQYDVQQGEwJVUzEXMBUGCgmSJoaeqasadsmT8ixkARkW
B3ZzcGhlcmUxFTATBgoJkiaJk/IsZAEZFgVsb2NhbDEOMAwGA1UEChMFdmNlcnfrrfgfdvQx
CzAJBgNVBAMTAkNBMIIBIjANBgkqhkiG9w
--------end certificate----------
--------begin certificate--------
ZGFwOi8vL0NOPXZpcnR1YWx2cHgtcvxcvxvVklSVFVBTERDMS1DQSxDTj1BSUEsQ049UHVi
bGljJTIwS2V5JTIwU2VydmldfsfhjZXMsQ049U2VydmfffljZXMsQ049Q29uZmlndXJhdGlv
bixEQz12aXJ0dWFsdnB4LERDPWxvY2FsP2NxvxcvxcvBQ2VydGlmaWNhdGU/YmFzZT9vYmpl
Y3RDbGFzcz1jZXJ0aWZpY2F0aW9uQXV0dsfsdffraG9yaXR5MD0GCSsGAQQBgjcVBwQwMC4G
--------end certificate----------

I want to get only the root certificate, the one that starts with CZImiZPy. I read the file into the variable data and applied the following regular expression:

re.sub('-----.*?-----', '', data)

But this returned both encrypted certificates, not just the first one. How can I tweak the expression so that it matches only the first?

+4
2 answers

You want to search for text, not replace it with something else.

>>> import re
>>> s = """--------begin certificate--------
<certificate encrypted>
--------end certificate----------
--------begin certificate--------
<certificate encrypted>
--------end certificate----------"""
>>> re.search(r"-+begin certificate-+\s+(.*?)\s+-+end certificate-+", s, flags=re.DOTALL).group(1)
'<certificate encrypted>'

Explanation:

-+begin certificate-+ # Match the starting label
\s+                   # Match whitespace (including linebreaks)
(.*?)                 # Match any characters, as few as possible (lazy). Capture the result in group 1
\s+                   # Match whitespace (including linebreaks)
-+end certificate-+   # Match the ending label

re.search() will always return the first match.
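
A minimal sketch of the same approach wrapped in a function, assuming the delimiters look like those in the question (runs of dashes around "begin certificate" and "end certificate"). The function name first_certificate and the shortened sample bodies are hypothetical. Since re.search() returns None when nothing matches, the sketch checks for that before calling group():

```python
import re

# Pattern from the answer above; re.DOTALL lets "." span line breaks.
CERT_RE = re.compile(
    r"-+begin certificate-+\s+(.*?)\s+-+end certificate-+",
    flags=re.DOTALL,
)


def first_certificate(data):
    """Return the body of the first certificate block, or None if there is none."""
    match = CERT_RE.search(data)
    return match.group(1) if match else None


# Shortened, made-up sample in the same layout as the question's file.
data = """--------begin certificate--------
CZImiZPy
LGQBGRYF
--------end certificate----------
--------begin certificate--------
ZGFwOi8v
--------end certificate----------"""

print(first_certificate(data))
```

Only the first block's body is returned; the second certificate is never touched because the search stops at the first match.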

+2

re.sub accepts a count parameter:

re.sub(pattern, repl, string, count=0, flags=0)

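count - the maximum number of pattern occurrences to replace. It must be a non-negative integer; if it is omitted or zero, all occurrences are replaced.

So, in your case:

re.sub('-----.*?-----', '', data, count=1)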

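That said, as the other answer points out, re.sub is probably not what you want here. Check the documentation of the re module and look at the other functions it offers.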
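
To make the effect of count concrete, here is a small sketch on made-up data. The bodies AAAA and BBBB are placeholders, and the delimiters are assumed to use exactly five dashes so that the pattern matches each delimiter line in full. It suggests that count=1 only limits how many delimiter matches are removed: the second certificate's body is still present in the result, so substitution alone does not isolate the first certificate.

```python
import re

# Made-up sample; five-dash delimiters are an assumption for this sketch.
data = (
    "-----begin certificate-----\n"
    "AAAA\n"
    "-----end certificate-----\n"
    "-----begin certificate-----\n"
    "BBBB\n"
    "-----end certificate-----\n"
)

# Default count=0: every delimiter line is removed, both bodies remain.
all_removed = re.sub('-----.*?-----', '', data)

# count=1: only the first delimiter line is removed.
first_removed = re.sub('-----.*?-----', '', data, count=1)

print(repr(all_removed))
print(repr(first_removed))
```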

+5

Source: https://habr.com/ru/post/1613166/

