How to make a group for each word in a sentence?

This may be a stupid question, but ...

Let's say you have a sentence like:

The quick brown fox

Or you might have a sentence like:

A quick brown fox jumped over a lazy dog

A simple regular expression (\w*) finds the first word "The" and puts it in a group.

For the first sentence, you could write (\w*)\s*(\w*)\s*(\w*)\s*(\w*)\s* to put each word in its own group, but that assumes you know the number of words in the sentence.

Is it possible to write a regular expression that puts each word of an arbitrary sentence in its own group? It would be nice if you could write something like (?:(\w*)\s*)* and have it capture each instance of (\w*), but that doesn't work.

I am doing this in Python, and my use case is obviously a bit more complicated than "The quick brown fox", so it would be nice if a regex could do it in one line. If that is not possible, I assume the next best solution is to loop over all the matches using re.findall() or something similar.

Thank you for your understanding.

Edit: For completeness, here is my actual use case and how I solved it with your help. Thanks again.

>>> import re
>>> s = '1 0 5 test1 5 test2 5 test3 5 test4 5 test5'
>>> s = re.match(r'^\d+\s\d+\s?(.*)', s).group(1)
>>> print(s)
5 test1 5 test2 5 test3 5 test4 5 test5
>>> words = re.findall(r'\d+\s(\w+)', s)
>>> print(words)
['test1', 'test2', 'test3', 'test4', 'test5']
+3
4 answers

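I don't think this is possible. Even a bounded repetition such as '((\w+)\s+){0,99}' reuses the same group on every pass, so you only get the last word back.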
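To see why a repeated group does not help, here is a minimal sketch using the sample sentence from the question. In Python's re module, a group that sits inside a repetition keeps only the text from its last pass. The variable names are just for illustration.

```python
import re

sentence = "The quick brown fox"

# A group inside a repetition is overwritten on every pass,
# so only the final word survives in group 1.
match = re.match(r"(?:(\w+)\s*)*", sentence)
last_only = match.groups()

# findall scans the whole string and returns every match of the pattern.
all_words = re.findall(r"\w+", sentence)

print(last_only)   # ('fox',)
print(all_words)   # ['The', 'quick', 'brown', 'fox']
```

So the pattern matches the whole sentence, but the earlier words are not kept in any group.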

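This looks like a job for split.

You could use re.split with '\s' as the pattern. Or, if you want runs of whitespace to count as a single separator, use '\s+'.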

>>> import re
>>> help(re.split)
Help on function split in module re:

split(pattern, string, maxsplit=0)
    Split the source string by the occurrences of the pattern,
    returning a list containing the resulting substrings.

>>> re.split(r'\s+', 'The   quick brown\t fox')
['The', 'quick', 'brown', 'fox']
+5

You can also use the findall function from the re module:

>>> import re
>>> re.findall(r"\w+", "The quick brown fox")
['The', 'quick', 'brown', 'fox']
+6

Why use a regex at all when the string's own split method does the job?

>>> "The quick brown fox".split()
['The', 'quick', 'brown', 'fox']
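One difference worth knowing when choosing between the two: with leading or trailing whitespace, re.split returns empty strings at the ends, while str.split() with no arguments drops them. A small sketch, with a made-up input string:

```python
import re

# Hypothetical input with leading and trailing whitespace.
text = "  The   quick brown\t fox \n"

# str.split() with no arguments splits on whitespace runs
# and ignores whitespace at both ends.
plain = text.split()

# re.split keeps an empty string wherever the separator
# touches the start or end of the input.
with_re = re.split(r"\s+", text)

# Stripping first makes re.split agree with str.split().
trimmed = re.split(r"\s+", text.strip())

print(plain)    # ['The', 'quick', 'brown', 'fox']
print(with_re)  # ['', 'The', 'quick', 'brown', 'fox', '']
print(trimmed)  # ['The', 'quick', 'brown', 'fox']
```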
+3

A regular expression cannot capture into an unknown number of groups. But there is hope in your case: look at the split method, which should do what you need.
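If you also need what a match group would give you, such as each word's position, one possible approach is to loop over re.finditer rather than splitting. The helper name word_groups below is hypothetical, and this is only a sketch of the idea:

```python
import re

def word_groups(sentence):
    """Return (word, (start, end)) for every word, however many there are."""
    # finditer yields one match object per word, so no fixed
    # number of groups has to be written into the pattern.
    return [(m.group(), m.span()) for m in re.finditer(r"\w+", sentence)]

groups = word_groups("The quick brown fox")
for word, span in groups:
    print(word, span)
```

Each match object behaves like a single-word match, so group() and span() work as usual.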

+1

Source: https://habr.com/ru/post/1753635/

