Regular expression to match function name and all arguments in Python

Suppose I have a line such as:

"func(arg1, arg2, arg3, arg4, ..., argn)" 

EDIT: This function is not in a specific language. It just has this format. If it simplifies things, do not think of it as a function call, just a string.

I want to write a regular expression to match a function and each argument. I am writing this in Python. Desired Result:

 {"function" : "func", "arg" : ["arg1", "arg2", ... , "argn"]} 

EDIT: Although arguments can be function calls, I can easily try to match them recursively with the same regex once I have one that works. By this, I mean that I can run the function again on each of the arguments. But this is not very relevant. I am not trying to create an interpreter, just something to recognize the arguments.

Here is my attempt:

    import re

    s = "func(arg1, arg2, arg3, arg4, argn)"
    m = re.match(r"(?P<function>\w+)\s?\((?P<args>(?P<arg>\w+(,\s?)?)+)\)", s)
    print(m.groupdict())

And here is the output:

    {'function': 'func', 'args': 'arg1, arg2, arg3, arg4, argn', 'arg': 'argn'}

The function matches just fine, and so does the set of arguments. However, I cannot seem to match the individual arguments. Is this a problem with my regex, or a limitation of Python regex matching?

EDIT2: I know that for now I can split the arguments using the following code:

    d["arg"] = d["args"].split(", ")

But I was wondering if I can do all the work with regular expressions. In particular, I wonder why "arg" matches only the last argument.

EDIT3: I suppose I was hoping to find out (1) why Python only matches the last argument each time, and (2) whether I can do Scheme-style pattern matching in Python, or whether Python has something as intuitive as Scheme-style pattern matching. I looked at the ast module, and its syntax is prohibitively complex.

3 answers

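It looks like you're 90% there. Why not just swap the arg and args groups and do:

    import re

    s = "func(arg1, arg2, arg3, arg4, argn)"
    # "arg" now names the outer group (the whole argument list);
    # "args" names the repeated inner group, which keeps only its last match.
    fn_match = re.match(r"(?P<function>\w+)\s?\((?P<arg>(?P<args>\w+(,\s?)?)+)\)", s)
    fn_dict = fn_match.groupdict()
    del fn_dict['args']
    fn_dict['arg'] = [arg.strip() for arg in fn_dict['arg'].split(',')]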
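On the question of why "arg" holds only the last argument: in Python's re module, a capturing group that repeats keeps just the text from its final iteration, so a single re.match call cannot hand back every argument. Here is a minimal sketch of a two-step alternative that stays with regular expressions. It assumes the arguments are plain `\w+` tokens with no nested calls, and the names `CALL_RE` and `parse_call` are made up for illustration:

```python
import re

# Outer pattern: the function name plus the raw text between the parentheses.
CALL_RE = re.compile(r"(?P<function>\w+)\s?\((?P<args>[^()]*)\)")

def parse_call(s):
    """Return {"function": name, "arg": [args...]}, or None if s does not match."""
    m = CALL_RE.match(s)
    if m is None:
        return None
    # A second pass collects every argument: findall returns all matches,
    # whereas a repeated capturing group retains only the last one.
    return {
        "function": m.group("function"),
        "arg": re.findall(r"\w+", m.group("args")),
    }

result = parse_call("func(arg1, arg2, arg3, arg4, argn)")
print(result)
# {'function': 'func', 'arg': ['arg1', 'arg2', 'arg3', 'arg4', 'argn']}
```

Nested calls such as `f(g(x), y)` are deliberately out of scope here, since `[^()]*` refuses any inner parentheses.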

Regular expressions cannot parse complex programming languages.

If you're just trying to parse Python, I suggest taking a look at the ast module, which will parse it for you.
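For what it's worth, a rough sketch of what that could look like with ast. It only works when the string happens to be valid Python syntax, and it assumes every argument is a bare name; the helper name `parse_call_ast` is made up for illustration:

```python
import ast

def parse_call_ast(source):
    """Parse "name(arg, ...)" and return the function name and argument names."""
    # mode="eval" parses a single expression; .body is that expression's node.
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("not a simple function call: %r" % source)
    args = []
    for arg in node.args:
        # This sketch handles bare names only; nested calls, literals, etc.
        # would need their own branches.
        if not isinstance(arg, ast.Name):
            raise ValueError("only bare names are handled in this sketch")
        args.append(arg.id)
    return {"function": node.func.id, "arg": args}

result = parse_call_ast("func(arg1, arg2, arg3, arg4, argn)")
print(result)
# {'function': 'func', 'arg': ['arg1', 'arg2', 'arg3', 'arg4', 'argn']}
```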


To answer the last part of your question: No. Python has nothing like Scheme's "match", and it doesn't have pattern matching like ML / Haskell. The closest thing it has is the ability to destructure things like this:

    >>> (a, [b, c, (d, e)]) = (1, [9, 4, (45, 8)])
    >>> e
    8

And you can extract the head and tail of a list (in Python 3.x) like this:

    >>> head, *tail = [1,2,3,4,5]
    >>> tail
    [2, 3, 4, 5]

There are some modules floating around that actually do pattern matching in Python, but I can't vouch for their quality.

If I had to do this, I would implement it a little differently - perhaps with the ability to pass in a type and optional arguments (for example, length or exact content) and a function to call if it matches, something like ([list, length = 3, check = (3, str), func]). That would match (list _ _ somestr) and call func with somestr in scope, and you could also add more patterns.
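A very rough sketch of that idea, just to make it concrete. Everything here is hypothetical: the function name `match`, the tuple layout `(type, length, check, func)`, and reading `check = (3, str)` as "the item at 1-based position 3 must be a str" are all assumptions, not an existing library:

```python
def match(value, patterns):
    """Try each pattern in order and call the first matching pattern's func."""
    for expected_type, length, check, func in patterns:
        if not isinstance(value, expected_type):
            continue
        if length is not None and len(value) != length:
            continue
        if check is not None:
            position, item_type = check  # 1-based position (an assumption)
            if position > len(value):
                continue
            item = value[position - 1]
            if not isinstance(item, item_type):
                continue
            # Hand the checked item to func, like "somestr in scope".
            return func(item)
        return func(value)
    return None  # no pattern matched

patterns = [
    # Roughly (list _ _ somestr): a 3-item list whose third item is a str.
    (list, 3, (3, str), lambda s: "third item is the string %r" % s),
    # Fallback pattern: any other list.
    (list, None, None, lambda v: "some other list of %d items" % len(v)),
]

first = match([1, 2, "somestr"], patterns)
second = match([1, 2, 3], patterns)
print(first)   # third item is the string 'somestr'
print(second)  # some other list of 3 items
```

More patterns can be added by appending tuples to the list; they are tried in order, so put the most specific ones first.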


Source: https://habr.com/ru/post/913257/

