How to split a string using 2 separated parameters?

Example:

r="\\%4l\\%(wit.*wit\\)\\|\\%8l\\%(rood.*rood\\)\\|\\%12l\\%(blauw.*blauw\\)\\|\\%13l\\%(wit.*wit\\)\\|\\%14l\\%(blauw.*blauw\\)\\|\\%15l\\%(wit.*wit\\)\\|\\%16l\\%(wit.*wit\\)\\|\\%17l\\%(rood.*rood\\)\\|\\%19l\\%(wit.*wit\\)\\|\\%21l\\%(blauw.*blauw\\)"

I want to split a string into a list, but not using 1 parameter, but 2 parameters.

  • First I want to write a number to l\\%(
  • Secondly, I want to capture text between \\%(and \\)\\|or in the case of a line ending between \\%(and\\)$

Conclusion:

[[4, "wit.*wit"], [8, "rood.*rood"], [12, "blauw.*blauw"], [13, "wit.*wit"], [14, "blauw.*blauw"], [15, "wit.*wit"], [16,"wit.*wit"], [17, "rood.*rood"], [19, "wit.*wit"], [21, "blauw.*blauw"]]

I tried breaking the string on \\|and replacing the subscript character "".

Is there a better way to do this in Python?

+4
source share
2 answers

One way to get closer to it is to use it re.findall()with two capture groups to find the right pairs:

In [3]: re.findall(r"%(\d+)l\\%\((.*?)\\\)", r)
Out[3]: 
[('4', 'wit.*wit'),
 ('8', 'rood.*rood'),
 ('12', 'blauw.*blauw'),
 ('13', 'wit.*wit'),
 ('14', 'blauw.*blauw'),
 ('15', 'wit.*wit'),
 ('16', 'wit.*wit'),
 ('17', 'rood.*rood'),
 ('19', 'wit.*wit'),
 ('21', 'blauw.*blauw')]
+6
source

findall(), , .

:

string = r"\%4l\%(wit.*wit\)\|\%8l\%(rood.*rood\)\|\%12l\%(blauw.*blauw\)\|\%13l\%(wit.*wit\)\|\%14l\%(blauw.*blauw\)\|\%15l\%(wit.*wit\)\|\%16l\%(wit.*wit\)\|\%17l\%(rood.*rood\)\|\%19l\%(wit.*wit\)\|\%21l\%(blauw.*blauw\)"

pairs = [substring[2:-2].split(r"l\%(") for substring in string.split(r"\|")]
# [['4', 'wit.*wit'], ['8', 'rood.*rood'], ['12', 'blauw.*blauw'], ['13', 'wit.*wit'], ['14', 'blauw.*blauw'], ['15', 'wit.*wit'], ['16', 'wit.*wit'], ['17', 'rood.*rood'], ['19', 'wit.*wit'], ['21', 'blauw.*blauw']]
+2

Source: https://habr.com/ru/post/1668872/


All Articles