Java - regular expression to extract a number in a given format

I have the following output:

  • 110 121 NATURAL 95 1570.40
  • 110 121 NATURAL 95 1570.40 *
  • 41.110 1 x 38.20 CZK) [A] *
  • '31, 831 261.791 1308.61)
  • > 01572 PRAVO SO 17.00
  • 1000 ks x 17.00
  • 1570.40

Each line of this output is stored in a list, and I want to extract the number 1570.40.

My regular expressions for this format look like this:

 "([1-9][0-9]*[\\.|,][0-9]{2})[^\\.\\d](.*)"
 "^([1-9][0-9]*[\\.|,][0-9]{2})$"

My problem: 1570.40 on the last line is matched (by the second regular expression), and so is 1570.40 in the line that ends with *, but the first line is not matched. Do you know where the problem is?

3 answers

I'm not sure I understand your needs correctly, but I think you could use word boundaries, for example:

 \b([1-9]\d*[.,]\d{2})\b 
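As a rough illustration of the word-boundary idea, here is a minimal sketch that runs this pattern over the lines from the question with `find()`. The class name `WordBoundaryAmounts` and the helper `firstAmount` are made up for this example; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryAmounts {
    // Pattern from the answer; backslashes are doubled for the Java string literal.
    static final Pattern AMOUNT = Pattern.compile("\\b([1-9]\\d*[.,]\\d{2})\\b");

    // Hypothetical helper: returns the first amount-like token, or null if none is found.
    static String firstAmount(String line) {
        Matcher m = AMOUNT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample lines copied from the question.
        String[] lines = {
            "110 121 NATURAL 95 1570.40",
            "110 121 NATURAL 95 1570.40 *",
            "41.110 1 x 38.20 CZK) [A] *",
            "'31, 831 261.791 1308.61)",
            "> 01572 PRAVO SO 17.00",
            "1000 ks x 17.00",
            "1570.40"
        };
        for (String line : lines) {
            System.out.println(firstAmount(line) + " <- " + line);
        }
    }
}
```

With this pattern the first and second lines both yield 1570.40, and tokens with three decimals such as 41.110 are skipped because there is no word boundary between two digits. A date-like token such as 12.05.2014 would still yield 12.05, since a dot counts as a boundary.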

To avoid matching dates, you can use:

 (?:^|[^.,\d])(\d+[,.]\d\d)(?:[^.,\d]|$) 

Explanation:

 The regular expression:

 (?-imsx:(?:^|[^.,\d])(\d+[,.]\d\d)(?:[^.,\d]|$))

 matches as follows:

 NODE                     EXPLANATION
 ----------------------------------------------------------------------
 (?-imsx:                 group, but do not capture (case-sensitive)
                          (with ^ and $ matching normally) (with . not
                          matching \n) (matching whitespace and #
                          normally):
 ----------------------------------------------------------------------
 (?:                      group, but do not capture:
 ----------------------------------------------------------------------
 ^                        the beginning of the string
 ----------------------------------------------------------------------
 |                        OR
 ----------------------------------------------------------------------
 [^.,\d]                  any character except: '.', ',', digits (0-9)
 ----------------------------------------------------------------------
 )                        end of grouping
 ----------------------------------------------------------------------
 (                        group and capture to \1:
 ----------------------------------------------------------------------
 \d+                      digits (0-9) (1 or more times (matching the
                          most amount possible))
 ----------------------------------------------------------------------
 [,.]                     any character of: ',', '.'
 ----------------------------------------------------------------------
 \d                       digits (0-9)
 ----------------------------------------------------------------------
 \d                       digits (0-9)
 ----------------------------------------------------------------------
 )                        end of \1
 ----------------------------------------------------------------------
 (?:                      group, but do not capture:
 ----------------------------------------------------------------------
 [^.,\d]                  any character except: '.', ',', digits (0-9)
 ----------------------------------------------------------------------
 |                        OR
 ----------------------------------------------------------------------
 $                        before an optional \n, and the end of the
                          string
 ----------------------------------------------------------------------
 )                        end of grouping
 ----------------------------------------------------------------------
 )                        end of grouping
 ----------------------------------------------------------------------
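To make the explanation above more concrete, here is a small sketch that applies the same pattern in Java and reads the number from capture group 1. The class name `DelimitedAmounts`, the helper `firstAmount`, and the date string are assumptions made up for this example; the other inputs come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitedAmounts {
    // Pattern from the answer; the number itself is capture group 1.
    static final Pattern AMOUNT =
            Pattern.compile("(?:^|[^.,\\d])(\\d+[,.]\\d\\d)(?:[^.,\\d]|$)");

    // Hypothetical helper: returns the first captured number, or null if none is found.
    static String firstAmount(String line) {
        Matcher m = AMOUNT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "41.110 1 x 38.20 CZK) [A] *",   // 41.110 has three decimals and is skipped
            "'31, 831 261.791 1308.61)",
            "12.05.2014",                    // made-up date: no match expected
            "1570.40"
        };
        for (String s : samples) {
            System.out.println(firstAmount(s) + " <- " + s);
        }
    }
}
```

One thing to keep in mind with this approach: the delimiter after the number is consumed as part of the match, so in a string like "1.00 2.00" repeated calls to `find()` return only 1.00, because the single space between the numbers has already been used up.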

Try the following:

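 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 String s = "41,110 1 x 38,20 CZK)[A] * ";
 // Matches digits, a comma, then more digits; dot-separated numbers are not matched.
 Matcher m = Pattern.compile("\\d+,\\d+").matcher(s);
 while (m.find()) {
     System.out.println(m.group()); // prints 41,110 and then 38,20
 }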

The pattern "([1-9][0-9]*[\\.|,][0-9]{2})[^\\.\\d](.*)" contains [^\\.\\d] , which means it requires one character after the number that is neither a dot nor a digit. In the second line the number is followed by a space (and then * ), so the pattern matches. In the first line the number is at the very end of the line, so there is nothing left for [^\\.\\d] to match and the pattern fails. I think you need only one regular expression that captures all the numbers: [^.\\d]*([1-9][0-9]*[.,][0-9]{2})[^.\\d]* . You should also use find() instead of matches() , so that the pattern can match any substring rather than having to match the entire string. It may also make sense to collect all the matches in case a line contains two such numbers; I'm not sure whether that applies to you.

Also, use either [0-9] or \d consistently. Mixing them is confusing: they mean the same thing but look different.
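The behaviour described in this answer can be checked with a short sketch. It runs the asker's first pattern and the suggested single pattern against the first two lines of the question. The class name `TrailingCharDemo` and the helper `firstGroup` are made up for this example, and it assumes the patterns are applied with `find()`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingCharDemo {
    // The asker's first pattern: needs one non-dot, non-digit character after the number.
    static final Pattern ORIGINAL =
            Pattern.compile("([1-9][0-9]*[\\.|,][0-9]{2})[^\\.\\d](.*)");
    // The single pattern suggested in the answer: surrounding characters are optional.
    static final Pattern SUGGESTED =
            Pattern.compile("[^.\\d]*([1-9][0-9]*[.,][0-9]{2})[^.\\d]*");

    // Hypothetical helper: group 1 of the first find() hit, or null if there is none.
    static String firstGroup(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line1 = "110 121 NATURAL 95 1570.40";    // number at the very end
        String line2 = "110 121 NATURAL 95 1570.40 *";  // number followed by " *"

        System.out.println(firstGroup(ORIGINAL, line1));   // null: nothing follows the number
        System.out.println(firstGroup(ORIGINAL, line2));   // 1570.40
        System.out.println(firstGroup(SUGGESTED, line1));  // 1570.40
        System.out.println(firstGroup(SUGGESTED, line2));  // 1570.40

        // matches() must consume the whole line, so it fails where find() succeeds.
        System.out.println(SUGGESTED.matcher(line1).matches()); // false
    }
}
```

Note that with `find()` the optional `[^.\\d]*` parts on either side do not restrict where the number may start or end, so this pattern alone would also pick 41.11 out of 41.110.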


Source: https://habr.com/ru/post/943433/

