Split a string of a specific pattern into three parts

I am given a string that has this pattern:

[blah blah blah] [more blah] some text 

I want to break the string into three parts: blah blah blah, more blah, and some text.

A rough way to do this is to use mystr.split('] ') and then strip the leading [ from the first two elements. Is there a better and more efficient way? I need to do this for thousands of lines very quickly.
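For reference, the rough split-based approach might look like the sketch below. The function name split_rough is made up for illustration, and it assumes every line really does have two bracketed parts followed by text.

```python
def split_rough(line):
    # Split on "] " at most twice so the trailing text is left intact.
    first, second, rest = line.split('] ', 2)
    # Drop the leading "[" from the two bracketed parts.
    return first[1:], second[1:], rest

print(split_rough('[blah blah blah] [more blah] some text'))
# ('blah blah blah', 'more blah', 'some text')
```

A line that does not have both bracketed parts raises ValueError on the unpacking.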

2 answers

You can use a regular expression to extract text if you know that it will be in this form. For efficiency, you can precompile the regex and then reuse it when matching.

    import re

    # Compile once, then reuse the compiled pattern for every line.
    prog = re.compile(r'\[([^\]]*)\]\s*\[([^\]]*)\]\s*(.*)')
    for mystr in string_list:
        result = prog.match(mystr)
        groups = result.groups()  # (first, second, rest)

If you need an explanation of the regular expression itself, you can get it with this tool.
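One thing to watch for: match() returns None when a line does not fit the pattern, so calling groups() on the result would raise AttributeError. Here is a minimal sketch that guards against that. The names PATTERN and split_line are only illustrative, and the second sample line is made up to show the non-matching case.

```python
import re

# Same pattern as above, compiled once at module level.
PATTERN = re.compile(r'\[([^\]]*)\]\s*\[([^\]]*)\]\s*(.*)')

def split_line(line):
    """Return (first, second, rest), or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

lines = ['[blah blah blah] [more blah] some text', 'no brackets here']
for line in lines:
    print(split_line(line))
# ('blah blah blah', 'more blah', 'some text')
# None
```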


You can also use a regex split, treating the bracket characters you want to drop as the separators:

    >>> import re
    >>> s = '[...] [...] ...'
    >>> re.split(r'\[|\] *\[?', s)[1:]
    ['...', '...', '...']
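Note that this split also fires on any brackets in the trailing text. If that can happen in your data, one option is to cap the number of splits with maxsplit. This is a sketch; the function name split_parts and the sample strings are assumptions for illustration.

```python
import re

SEP = re.compile(r'\[|\] *\[?')

def split_parts(line):
    # Three splits: the opening "[", the "] [" in the middle, and the final "] ".
    # Anything after that is kept whole, even if it contains brackets.
    return SEP.split(line, maxsplit=3)[1:]

print(split_parts('[blah blah blah] [more blah] some text'))
# ['blah blah blah', 'more blah', 'some text']
print(split_parts('[a] [b] some [text]'))
# ['a', 'b', 'some [text]']
```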

Source: https://habr.com/ru/post/1482264/

