How to remove text in parentheses using a regular expression?

I am trying to process a lot of files, and I need to modify them to remove extraneous data in the file names; in particular, I'm trying to remove text in parentheses. For example:

filename = "Example_file_(extra_descriptor).ext" 

and I want to apply a regex to a whole bunch of files where the parenthesized expression may be in the middle or at the end, and of variable length.

What will the regular expression look like? The preferred syntax is Perl or Python.

+54
python regex perl
Mar 12 '09 at 18:56
9 answers
 s/\([^)]*\)// 

So in Python you would do:

 re.sub(r'\([^)]*\)', '', filename) 
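
For a whole batch of names, the same substitution can be wrapped in a small helper. This is only a minimal sketch: the helper name `strip_parens` and the sample names are hypothetical, and it computes the new names without renaming anything on disk.

```python
import re

# "(" followed by any run of characters other than ")", then ")".
PAREN_CHUNK = re.compile(r'\([^)]*\)')

def strip_parens(filename):
    """Return filename with every non-nested parenthesized chunk removed."""
    return PAREN_CHUNK.sub('', filename)

# Hypothetical sample names, used only for illustration.
names = [
    "Example_file_(extra_descriptor).ext",
    "report_(draft)_2009.txt",
    "plain_name.ext",
]
for old in names:
    print(old, "->", strip_parens(old))
```

Note that the characters around the removed chunk are left alone, so "report_(draft)_2009.txt" becomes "report__2009.txt" with a doubled underscore.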
+90
Mar 12 '09 at 18:59

A pattern that matches parenthesized substrings containing no other ( or ) characters in between (e.g., (xyz 123) in Text (abc(xyz 123))) is:

 \([^()]*\) 

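Details:

  • \( matches an opening parenthesis
  • [^()]* matches zero or more characters other than ( and )
  • \) matches a closing parenthesis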

Code snippets for removing the matches:

  • JavaScript : string.replace(/\([^()]*\)/g, '')
  • PHP : preg_replace('~\([^()]*\)~', '', $string)
  • Perl : $s =~ s/\([^()]*\)//g
  • Python : re.sub(r'\([^()]*\)', '', s)
  • C# : Regex.Replace(str, @"\([^()]*\)", string.Empty)
  • VB.NET : Regex.Replace(str, "\([^()]*\)", "")
  • Java : s.replaceAll("\\([^()]*\\)", "")
  • Ruby : s.gsub(/\([^()]*\)/, '')
  • R : gsub("\\([^()]*\\)", "", x)
  • Lua : string.gsub(s, "%([^()]*%)", "")
  • Bash / sed : sed 's/([^()]*)//g'
  • Tcl : regsub -all {\([^()]*\)} $s "" result
  • C++ (std::regex) : std::regex_replace(s, std::regex(R"(\([^()]*\))"), "")
  • Objective-C :
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\([^()]*\\)" options:NSRegularExpressionCaseInsensitive error:&error];
    NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@""];
  • Swift : s.replacingOccurrences(of: "\\([^()]*\\)", with: "", options: [.regularExpression])
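
Because `[^()]*` only matches innermost groups, a single pass leaves the outer parentheses of a nested group in place. One possible way to deal with nesting, sketched here in Python under the assumption that the parentheses are balanced, is to repeat the substitution until nothing changes. The helper name `remove_all_parens` is hypothetical.

```python
import re

# Matches only groups that contain no further parentheses.
INNERMOST = re.compile(r'\([^()]*\)')

def remove_all_parens(s):
    """Repeatedly strip innermost (...) groups until none are left."""
    while True:
        stripped, count = INNERMOST.subn('', s)
        if count == 0:
            return stripped
        s = stripped

print(INNERMOST.sub('', 'Text (abc(xyz 123))'))  # one pass: Text (abc)
print(remove_all_parens('Text (abc(xyz 123))'))  # all passes: "Text " (trailing space kept)
```

Unbalanced parentheses are left in the output untouched, since they never form a complete innermost group.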
+27
Nov 15 '16 at 23:07

I would use:

 \([^)]*\) 
+21
Mar 12 '09 at 19:08

If you don't absolutely need to use a regex, consider using Perl's Text::Balanced to remove the parentheses.

 use Text::Balanced qw(extract_bracketed);

 my ($extracted, $remainder, $prefix) = extract_bracketed( $filename, '()', '[^(]*' );
 {
     no warnings 'uninitialized';

     $filename = (defined $prefix or defined $remainder)
                 ? $prefix . $remainder
                 : $extracted;
 }

You might be thinking, "Why all this when a regexp does the trick on one line?"

 $filename =~ s/\([^)]*\)//;

Text::Balanced handles nested parentheses, so $filename = 'foo_(bar(baz)buz)).foo' will be extracted properly. The regex-based solutions offered here will fail on this string: one stops at the first closing parenthesis, and the other eats them all.

 $filename =~ s/\([^)]*\)//; # returns 'foo_buz)).foo'

 $filename =~ s/\(.*\)//; # returns 'foo_.foo'

 # the Text::Balanced example returns 'foo_).foo'

If either of the regex behaviors is acceptable, use a regex, but document the limitations and the assumptions being made.
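
The two regex behaviors described above can be reproduced with Python's `re` module. This is just an illustrative sketch: `count=1` is used to mimic Perl's `s///` without the `g` flag, and the variable names are arbitrary.

```python
import re

filename = 'foo_(bar(baz)buz)).foo'

# Negated character class: the match ends at the first ")".
first_close = re.sub(r'\([^)]*\)', '', filename, count=1)

# Greedy dot: the match runs on to the last ")" in the string.
greedy = re.sub(r'\(.*\)', '', filename, count=1)

print(first_close)  # foo_buz)).foo
print(greedy)       # foo_.foo
```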

+6
Mar 12 '09 at 22:55

If you can use sed (possibly invoked from within your program), it would be as simple as:

 sed 's/(.*)//g' 
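
Keep in mind that `.*` in this sed expression is greedy, so when a name holds more than one parenthesized group, everything from the first `(` to the last `)` is removed. The same behavior can be seen with Python's `re` module; the file name below is a made-up example.

```python
import re

name = "a_(x)_b_(y).ext"  # hypothetical name with two separate groups

greedy = re.sub(r'\(.*\)', '', name)      # spans from the first "(" to the last ")"
bounded = re.sub(r'\([^)]*\)', '', name)  # stops at each group's own ")"

print(greedy)   # a_.ext
print(bounded)  # a__b_.ext
```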
+2
Mar 12 '09 at 19:03

If a path may contain parentheses, then the regex r'\(.*?\)' alone is not enough:

 import os, re

 def remove_parenthesized_chunks(path, safeext=True, safedir=True):
     dirpath, basename = os.path.split(path) if safedir else ('', path)
     name, ext = os.path.splitext(basename) if safeext else (basename, '')
     name = re.sub(r'\(.*?\)', '', name)
     return os.path.join(dirpath, name+ext)

By default, the function preserves parenthesized chunks in the directory and extension parts of the path.

Example:

 >>> f = remove_parenthesized_chunks
 >>> f("Example_file_(extra_descriptor).ext")
 'Example_file_.ext'
 >>> path = r"c:\dir_(important)\example(extra).ext(untouchable)"
 >>> f(path)
 'c:\\dir_(important)\\example.ext(untouchable)'
 >>> f(path, safeext=False)
 'c:\\dir_(important)\\example.ext'
 >>> f(path, safedir=False)
 'c:\\dir_\\example.ext(untouchable)'
 >>> f(path, False, False)
 'c:\\dir_\\example.ext'
 >>> f(r"c:\(extra)\example(extra).ext", safedir=False)
 'c:\\\\example.ext'
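
The default behavior (leave directories and the extension alone) could also be sketched with `pathlib`. This is an alternative illustration, not part of the answer's function: the name `strip_parens_from_stem` is hypothetical, and `PurePosixPath` is assumed so that the result does not depend on the operating system.

```python
import re
from pathlib import PurePosixPath

def strip_parens_from_stem(path):
    """Remove (...) chunks from the file stem only.

    Directory components and the extension are left as they are.
    """
    p = PurePosixPath(path)
    new_stem = re.sub(r'\(.*?\)', '', p.stem)
    # with_name() raises ValueError if the resulting name is empty.
    return str(p.with_name(new_stem + p.suffix))

print(strip_parens_from_stem("dir_(important)/example(extra).ext(untouchable)"))
```

A name that consists of nothing but a parenthesized chunk would leave an empty file name, which `with_name()` rejects, so such inputs need separate handling.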
+2
Mar 12 '09 at 20:03

For those who want to use Python, here is a simple routine that removes parenthesized substrings, including those with nested parentheses. Okay, it's not a regex, but it will do the job!

 def remove_nested_parens(input_str):
     """Returns a copy of 'input_str' with any parenthesized text removed. Nested parentheses are handled."""
     result = ''
     paren_level = 0
     for ch in input_str:
         if ch == '(':
             paren_level += 1
         elif (ch == ')') and paren_level:
             paren_level -= 1
         elif not paren_level:
             result += ch
     return result

 remove_nested_parens('example_(extra(qualifier)_text)_test(more_parens).ext')
+2
Dec 14 '17 at 22:30
 >>> import re
 >>> filename = "Example_file_(extra_descriptor).ext"
 >>> p = re.compile(r'\([^)]*\)')
 >>> re.sub(p, '', filename)
 'Example_file_.ext'
0
Mar 12 '09 at 21:48

Java Code:

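 // requires java.util.regex.Pattern and java.util.regex.Matcher
 Pattern pattern1 = Pattern.compile("(\\_\\(.*?\\))");
 Matcher matcher1 = pattern1.matcher(fileName);
 if (matcher1.find()) System.out.println(fileName.replace(matcher1.group(1), ""));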
0
Aug 03 '12 at 9:30