Regex to match zero or more keyword instances, in any order

I am decent with regular expressions, but this one is tough. The problem lies with group 2, described below. I suspect it will be fairly easy for a regular expression guru...

Problem

I am trying to match zero or more instances of a set of keywords, in any order.


[Update: for future reference]
The simplest solution (derived from black panda's answer): ((keyword1 |keyword2 |keyword3 )*)

Note: the space after each word is significant!

In my case, this is translated into:
((static |final )*)

This is the simplest answer. A better, more efficient approach is in black panda's answer below: it allows any amount of whitespace between the keywords and is faster for the regex engine to process.
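A quick check of the simple pattern, sketched with Python's re module. Python is used only for illustration here; the pattern itself is not Python-specific. The helper name modifiers is hypothetical, and the sample strings are fragments of the declarations from this question.

```python
import re

# The outer group captures the whole run of keywords; the inner group
# repeats one keyword followed by exactly one space.
SIMPLE = re.compile(r"((static |final )*)")

def modifiers(text):
    """Return the run of 'static '/'final ' keywords at the start of text."""
    return SIMPLE.match(text).group(1)

print(repr(modifiers("static final int ONE")))  # 'static final '
print(repr(modifiers("final static int TWO")))  # 'final static '
print(repr(modifiers("int FIVE")))              # ''
```

The captured text keeps the trailing space of the last keyword, so it may need stripping before use.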


Input

I need to break up the following input into very specific groups.

Note: the list markers are not part of the input. That is, each line of input begins with the letter p.

  • public static final int ONE = 1;
  • public final static int TWO = 2;
  • public final int THREE = 3;
  • public static int FOUR = 4;
  • private int FIVE = 5;

Groups

I need to split the input into match groups so that

group 1 = public or private or protected
group 2 = 0 or more instances of "static" or "final" <- the group I'm struggling with
group 3 = data type
group 4 = variable name
group 5 = value

Group 2 details

Given the above data, group 2 will be as follows:

  • static final
  • final static
  • final
  • static
  • [empty string]

Failed solutions

This is the regex I came up with, and it is not working for group 2:

 ^.*(public|private|protected)\s+(static\s+|final\s+)*\s+([^ ]+)\s+([^ ]+)\s*(;|=)(.*)$ 

for group 2, I tried:

  • (static\s+|final\s+)*
  • (static|final)*\s+
  • (static|final)*
  • (static \ | final \) *
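One way to see what goes wrong with the full pattern, sketched with Python's re module (an illustrative choice, not necessarily the asker's environment). In the pattern from the question, (static\s+|final\s+)* already consumes the whitespace after the last keyword, so the mandatory \s+ that follows has nothing left to match. The RELAXED variant below is a hypothetical tweak that changes that \s+ to \s*; the line then matches, but a repeated capturing group keeps only its last iteration.

```python
import re

LINE = "public static final int ONE = 1;"

# The pattern from the question, unchanged.
ORIGINAL = (r"^.*(public|private|protected)\s+(static\s+|final\s+)*\s+"
            r"([^ ]+)\s+([^ ]+)\s*(;|=)(.*)$")

# Hypothetical tweak: the \s+ right after group 2 becomes \s*.
RELAXED = (r"^.*(public|private|protected)\s+(static\s+|final\s+)*\s*"
           r"([^ ]+)\s+([^ ]+)\s*(;|=)(.*)$")

original_match = re.match(ORIGINAL, LINE)
relaxed_match = re.match(RELAXED, LINE)

print(original_match)                # None: the whole line fails to match
print(repr(relaxed_match.group(2)))  # 'final ': only the last repetition is kept
print(repr(relaxed_match.group(3)))  # 'int'
```

Wrapping the repetition in an outer capturing group, as in the update at the top of the question, is what makes group 2 hold the full run of keywords.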

Summary

What should the regular expression for "group 2" be so that it matches zero or more instances of the words "static" or "final"? The correct solution should be extensible to any set of keywords, such as [static, final, transient, volatile].

+4
4 answers

Can you capture everything in between and make sure there are groups of 3 or more?

group 2 = ((?:(?:static|final|transient|volatile)\s+)*)
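A minimal sketch of how this group could be built from an arbitrary keyword list, again in Python for illustration. Only the group 2 fragment comes from this answer; the surrounding pattern for groups 1, 3, 4 and 5 and the helper names build_pattern and split_declaration are assumptions made for the example.

```python
import re

KEYWORDS = ["static", "final", "transient", "volatile"]

def build_pattern(keywords):
    """Compile a declaration pattern whose group 2 accepts any run of keywords."""
    alternation = "|".join(re.escape(k) for k in keywords)
    return re.compile(
        r"^\s*(public|private|protected)\s+"       # group 1: access modifier
        r"((?:(?:" + alternation + r")\s+)*)"      # group 2: zero or more keywords
        r"(\S+)\s+(\S+)\s*=\s*(.*?)\s*;\s*$"       # groups 3-5 (assumed layout)
    )

def split_declaration(line, pattern):
    """Return (access, modifiers, type, name, value), or None if no match."""
    m = pattern.match(line)
    if m is None:
        return None
    access, mods, dtype, name, value = m.groups()
    return access, mods.strip(), dtype, name, value  # drop trailing whitespace

PATTERN = build_pattern(KEYWORDS)
print(split_declaration("public final static int TWO = 2;", PATTERN))
# ('public', 'final static', 'int', 'TWO', '2')
print(split_declaration("private int FIVE = 5;", PATTERN))
# ('private', '', 'int', 'FIVE', '5')
```

Because each keyword must be followed by \s+, the group tolerates any amount of whitespace between keywords, and extending it only means adding words to KEYWORDS.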

+2

You can try:

 ^(?!.*\bstatic\s+static\b)(?!.*\bfinal\s+final\b).*(public|private|protected)\s+(static\s+|final\s+)?(static\s+|final\s+)?(\S+)\s+(\S+)\s*(;|=.*)$ 
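To see how this pattern behaves, here is a short sketch in Python (used for illustration only; the helper name groups is hypothetical). The keywords land in two separate optional groups (2 and 3) rather than in a single group, so the type, name and value shift to groups 4 to 6, and at most two keywords are accepted. The negative lookaheads reject a keyword repeated twice in a row.

```python
import re

PATTERN = re.compile(
    r"^(?!.*\bstatic\s+static\b)(?!.*\bfinal\s+final\b)"  # no doubled keyword
    r".*(public|private|protected)\s+"
    r"(static\s+|final\s+)?(static\s+|final\s+)?"          # up to two keywords
    r"(\S+)\s+(\S+)\s*(;|=.*)$"
)

def groups(line):
    """Return the tuple of captured groups, or None if the line is rejected."""
    m = PATTERN.match(line)
    return m.groups() if m else None

print(groups("public static final int ONE = 1;"))
# ('public', 'static ', 'final ', 'int', 'ONE', '= 1;')
print(groups("private int FIVE = 5;"))
# ('private', None, None, 'int', 'FIVE', '= 5;')
print(groups("public static static int BAD = 0;"))
# None
```

Supporting more keywords with this approach means adding one optional group and one lookahead per keyword, which is the trade-off against the single repeated group.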


0

This matches zero or more instances of the words "static" or "final":

 (static|final)* 

As you can see from these Perl fragments:

 perl -e '$_ = "static final"; print $1 if /(static|final)*/;'   # prints "static"
 perl -e '$_ = ""; print "matched" if /(static|final)*/;'        # prints "matched"

If your matches are not working, the problem may be elsewhere.

0

What about:

 #!/usr/bin/perl
 use strict;
 use warnings;
 use Data::Dump qw(dump);

 while (<DATA>) {
     # Groups: access modifier, keywords, type, name, optional value
     my @l = $_ =~ /^\s*(public|private|protected)\s+((?:static\s+|final\s+)*)\s*(\S+)\s+(\S+)(?:\s+=\s*(.*))?\s*;\s*$/;
     dump @l;
 }

 __DATA__
 public static final int ONE = 1;
 public final static int TWO = 2;
 public final int THREE = 3;
 public static int FOUR = 4;
 private int FIVE = 5;

Output:

 ("public", "static final", "int", "ONE", 1)
 ("public", "final static", "int", "TWO", 2)
 ("public", "final", "int", "THREE", 3)
 ("public", "static", "int", "FOUR", 4)
 ("private", "", "int", "FIVE", 5)
0

Source: https://habr.com/ru/post/1386553/

