Regex findall start() and end()? (Python)

I am trying to get the start and end positions of a query in a sequence using re.findall:

    import re
    sequence = 'aaabbbaaacccdddeeefff'
    query = 'aaa'
    findall = re.findall(query, sequence)
    # findall is ['aaa', 'aaa']

How can I get something like findall.start() or findall.end()?

I would like to get

    start = [0, 6]
    end = [2, 8]

I know that

    search = re.search(query, sequence)
    print(search.start(), search.end())  # 0 3

would give me only the first instance.

+4
3 answers

Use re.finditer:

    >>> import re
    >>> sequence = 'aaabbbaaacccdddeeefff'
    >>> query = 'aaa'
    >>> r = re.compile(query)
    >>> [[m.start(), m.end()] for m in r.finditer(sequence)]
    [[0, 3], [6, 9]]

From the docs:

Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in the string. The string is scanned from left to right, and matches are returned in the order found.
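Note that m.end() is exclusive (one past the last matched character), which is why the result is [[0, 3], [6, 9]] rather than the end = [2, 8] asked for in the question. A minimal sketch of getting two separate lists with inclusive end positions, assuming the pattern never matches an empty string (the names starts and ends are just illustrative):

```python
import re

sequence = 'aaabbbaaacccdddeeefff'
query = 'aaa'

# Materialize the matches once so both lists come from the same scan.
matches = list(re.finditer(query, sequence))

starts = [m.start() for m in matches]
# m.end() is exclusive, so subtract 1 to get the index of the
# last matched character, as the question asks.
ends = [m.end() - 1 for m in matches]

print(starts)  # [0, 6]
print(ends)    # [2, 8]
```

If you plan to slice the sequence with these positions, it is probably simpler to keep the exclusive m.end() value, since sequence[start:end] then returns the matched text directly.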

+8

You can't. findall is a convenience function that, as the docs say, returns a "list of strings". If you need a list of MatchObjects, you cannot use findall.

However, you can use finditer. If you are just iterating over the matches with for match in re.findall(…): you can use for match in re.finditer(…): the same way, except that you get MatchObject values instead of strings. If you really need a list, just use matches = list(re.finditer(…)).
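As a rough side-by-side sketch of the difference, using the sequence and query from the question (the variable names strings, positions and matches are only for illustration):

```python
import re

sequence = 'aaabbbaaacccdddeeefff'
query = 'aaa'

# findall returns plain strings: the positions are already lost.
strings = re.findall(query, sequence)

# finditer yields match objects, which still carry their positions.
positions = []
for match in re.finditer(query, sequence):
    positions.append((match.group(), match.start(), match.end()))

# If an actual list of match objects is needed, wrap the iterator.
matches = list(re.finditer(query, sequence))

print(strings)       # ['aaa', 'aaa']
print(positions)     # [('aaa', 0, 3), ('aaa', 6, 9)]
print(len(matches))  # 2
```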

+3

Use finditer instead of findall. It returns an iterator of MatchObject instances, and you can get the start and end positions from each MatchObject.
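If you want both positions at once, each match object also has a span() method that returns a (start, end) tuple. A small sketch, where find_spans is a hypothetical helper name and not part of the re module:

```python
import re

def find_spans(pattern, text):
    """Return a (start, end) tuple for each match; end is exclusive."""
    return [m.span() for m in re.finditer(pattern, text)]

sequence = 'aaabbbaaacccdddeeefff'
spans = find_spans('aaa', sequence)
print(spans)  # [(0, 3), (6, 9)]

# Slicing with a span recovers the matched text.
start, end = spans[1]
print(sequence[start:end])  # aaa
```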

+1

Source: https://habr.com/ru/post/1491022/