Regex to remove a percentage

Hi, I would be very grateful for help creating a regex that removes a percentage from the end of a line:

Film name (2009) 58% -> Film name (2009)
Film name (2010) 59% -> Film name (2010)

A line may or may not have a year in parentheses. Before the year, the name of the movie can be alphanumeric and consist of several words.

I use Bulk Rename Utility, so I want to fill in its Match and Replace fields.

The best I could come up with was:

 ([A-Z][a-z]*) \((\d*)\) (\d*\%) --> \1 (\2) 

although it seemed to work only with single-word movie names, and it lost the parentheses, so I had to add them again!

I have googled, and every expression I have tried fails in Bulk Rename Utility, which, as far as I can tell, is based on PCRE.

+4
5 answers

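To avoid replacing the wrong things, match

 \b(100|\d{1,2})%\b 

and replace it with nothing.

The leading \b stops at a word boundary (so 30% is accepted but w30% is not), and the group accepts only 100 or the numbers 0-99. Note that the trailing \b only succeeds when a word character follows the %, so this form misses a percentage at the very end of the line.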

EDIT:

If % is the last character of the string, you get a better result with

 \b(100|\d{1,2})%$ 

This way you match only a percentage at the end of the line and avoid removing numbers followed by % from the movie name itself.
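As a quick check of the two patterns above, here is a minimal sketch in Python. Using Python's `re` module is an assumption made for illustration; Bulk Rename Utility's own engine may differ in details. The sample names and the `strip_percent` helper are hypothetical.

```python
import re

# Pattern from the answer's EDIT: a 0-100 percentage anchored at the end.
END_ANCHORED = re.compile(r"\b(100|\d{1,2})%$")
# First pattern from the answer, kept for comparison.
WORD_BOUNDED = re.compile(r"\b(100|\d{1,2})%\b")


def strip_percent(name: str) -> str:
    """Delete a trailing percentage, then trim the space left before it."""
    return END_ANCHORED.sub("", name).rstrip()


# Hypothetical sample names; the second has a percentage inside the title.
samples = ["Film name (2009) 58%", "50% Off (2011) 73%", "Film name (2010)"]
for sample in samples:
    print(repr(strip_percent(sample)))

# A trailing \b needs a word character after the %, so in Python's re
# module the first pattern finds nothing when the % ends the string.
print(WORD_BOUNDED.search("Film name (2009) 58%"))
```

The `rstrip()` call is there because the pattern itself does not consume the space in front of the number.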

If the string is a file name and you have to replace the whole name rather than just delete the matched part, you can match

 (.+?)(100|[0-9]{1,2})%$ #I think using 0-9 is accepted by more languages 

and replace with

 $1 

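In Perl, \1 and \2 should not be used in the replacement expression. They are backreferences that belong inside the pattern, where they match the text of the first and second captures. $1 and $2 are variables that hold what the first and second captures matched, so those are the ones to use in the replacement. (Other tools differ: sed, for example, uses \1 in the replacement.)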
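The whole-name replacement can be sketched as follows. This is an illustration in Python, not the syntax of Bulk Rename Utility: replacement syntax varies by tool, and the `keep_name` helper is a hypothetical name.

```python
import re

# Pattern from the answer: group 1 is everything before the percentage.
WHOLE_NAME = re.compile(r"(.+?)(100|[0-9]{1,2})%$")


def keep_name(filename: str) -> str:
    """Replace the whole match with capture group 1 only."""
    # Python's re.sub writes the first capture as \1 (or \g<1>) in the
    # replacement; Perl would write $1 in the same place.
    return WHOLE_NAME.sub(r"\1", filename)


result = keep_name("Film name (2009) 58%")
print(repr(result))           # the space before the percentage is kept
print(repr(result.rstrip()))  # trimmed version
```

Because group 1 stops right before the digits, it still contains the space that separated the name from the percentage, so a trim step (or a `\s*` before the number group) may be needed.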

+2

This is very simple to do with

 s/\s*\d+%$// 

which deletes the trailing run of digits followed by a percent sign, along with any whitespace in front of it.

 use strict;
 use warnings;

 while (<DATA>) {
     s/\s*\d+%$//;
     print;
 }

 __DATA__
 Film name (2009) 58%
 Film name (2010) 59%

Output

 Film name (2009)
 Film name (2010)
+4

I am not familiar with the utility, but in general, replacing [0-9]+% with nothing should work. Be careful, though, if any of the names themselves contain percentages!
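The caveat about percentages inside names can be seen in a small Python sketch. The title below is made up, and Python's `re.sub` (which replaces every match) is an assumption; the utility may replace only the first match, which would be just as wrong here.

```python
import re

# Hypothetical file name with a percentage inside the title itself.
name = "50% Off (2011) 73%"

# Unanchored: every "digits + %" run is removed, including the title's.
unanchored = re.sub(r"[0-9]+%", "", name)

# Anchored at the end of the string: only the trailing percentage goes.
anchored = re.sub(r"\s*[0-9]+%$", "", name)

print(repr(unanchored))
print(repr(anchored))
```

Anchoring with `$` is the smallest change that keeps the title intact.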

+2

You are lucky that the percentage (if it exists) is always last. Use that as the key fact and don't try to match anything else. (As a rule with regular expressions, matching text that you are not going to change only adds chances for something to go wrong, with no benefit. Do it only when you need it to locate the part you do want to change.)

My guess is that some of the previous answers were more or less correct, but none of them worked for you because of a typo in one of the "}", ")", "|" or "\" characters. Regular expressions must be exact: a backslash is not a forward slash, a square bracket is not a curly bracket, brackets must be paired, a plus is not a star, lowercase is not uppercase, you cannot add whitespace just anywhere, and so on. Most of them also failed because you sometimes have trailing spaces at the ends of your lines. Your "match" field should be

 \s+(100|\d\d?)%\s*$ 

and your "replace" field should be completely empty.

Another thought: is it possible that some of your data has a space between the digits and the percent sign, for example: foo bar (2012) 83 %? If so, change the match field to allow for that possibility:

 \s+(100|\d\d?)\s*%\s*$ 
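A short Python sketch of the more tolerant pattern, under the assumption that Python's `re` behaves like the utility's engine for these constructs. The `clean` helper and the sample strings with extra spaces are hypothetical.

```python
import re

# Tolerant pattern: whitespace before the number, optional whitespace
# before the % sign, and optional trailing whitespace before the end.
TOLERANT = re.compile(r"\s+(100|\d\d?)\s*%\s*$")


def clean(name: str) -> str:
    """Remove a trailing percentage together with surrounding whitespace."""
    return TOLERANT.sub("", name)


samples = [
    "Film name (2009) 58%  ",  # trailing spaces after the %
    "foo bar (2012) 83 %",     # space between the digits and the %
    "Film name (2010)",        # no percentage at all
]
for sample in samples:
    print(repr(clean(sample)))
```

Since the leading `\s+` is part of the match, no separate trim step is needed. The flip side is that a name consisting of nothing but a percentage would not match.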

0

Here is my suggestion:

 ^([1-9]([0-9])*?|0)(\.[0-9]+)?%?$ 

It matches "12", "0.123", "12.44", "102.12345", and also values with % at the end, such as "11.22%" and "11%".

In other words, it matches a number with any number of digits before and after the decimal point, optionally followed by the symbol "%" (the decimal part and the % are both optional).
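One property worth checking before using this pattern for the rename task: the ^ and $ anchors mean it must span the entire string. The Python sketch below tries it on the values listed above and on one full file name from the question; Python's `re` is again an assumption made for illustration.

```python
import re

# Pattern from the answer: ^ and $ force it to span the entire string.
STANDALONE = re.compile(r"^([1-9]([0-9])*?|0)(\.[0-9]+)?%?$")

# Values listed in the answer, plus one full file name for contrast.
candidates = ["12", "0.123", "12.44", "102.12345", "11.22%", "11%",
              "Film name (2009) 58%"]
for text in candidates:
    verdict = "match" if STANDALONE.search(text) else "no match"
    print(f"{text!r}: {verdict}")
```

So the pattern works as a validator for a standalone number or percentage, but it would need the leading ^ removed (and the % made mandatory) to locate a percentage at the end of a longer name.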

Hope this helps;)

0

Source: https://habr.com/ru/post/1437027/

