Recursive nested expression in Python

I am using Python 2.6.4.

I have a number of select statements in a text file, and I need to extract the field names from each select query. This would be easy if some of the fields did not use nested functions such as to_char().

A field in the select list can contain several levels of nested parentheses, such as "ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name", or it can be just "base_field_name" in the simple case. Can I use Python's re module to write a regular expression that extracts base_field_name? If so, what would the regex look like?

+3
6 answers
>>> import re
>>> string = 'ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name'
>>> rx = re.compile(r'^(.*?\()*(.+?)(,.*?)*(,|\).*?)*$')
>>> rx.search(string).group(2)
'base_field_name'
>>> rx.search('base_field_name').group(2)
'base_field_name'
+2

Regular expressions are not suited to parsing "nested" structures. Use a full-fledged parsing library instead, for example pyparsing. There are published examples of using pyparsing specifically for SQL parsing; you would certainly need to take them as a starting point and write your own grammar, but that is definitely not too complicated.
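As a dependency-free illustration of the parser idea (this is not pyparsing, just a hand-written recursive descent sketch using only the standard library), the nesting can be handled by a function that calls itself for each argument. The names tokenize, parse_expr and base_field_name are hypothetical, and the sketch assumes every argument is an identifier or a nested call, so quoted format strings or numeric literals would need extra token rules.

```python
import re

# Tokens: identifiers, parentheses and commas; leading whitespace is skipped.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[(),])")

def tokenize(text):
    """Split a field expression into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    tokens.append(None)  # sentinel marking the end of input
    return tokens

def parse_expr(tokens, i):
    """Parse `name` or `name(arg, ...)` starting at tokens[i].

    Returns (base_name, next_index), where base_name is the innermost
    first argument of the call chain.
    """
    name = tokens[i]
    if name is None or name in ("(", ")", ","):
        raise ValueError("expected an identifier")
    i += 1
    if tokens[i] == "(":
        base, i = parse_expr(tokens, i + 1)    # first argument: recurse
        while tokens[i] == ",":                # other arguments: parse and ignore
            _, i = parse_expr(tokens, i + 1)
        if tokens[i] != ")":
            raise ValueError("expected ')'")
        return base, i + 1
    return name, i

def base_field_name(field):
    """Return the base field; a trailing alias after the expression is ignored."""
    base, _ = parse_expr(tokenize(field), 0)
    return base

print(base_field_name("ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name"))
print(base_field_name("base_field_name"))
```

Both calls print base_field_name. Unlike a single regex, the recursion depth follows the nesting depth of the input, so no fixed limit on nesting is built into the pattern.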

+11


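A simpler regex that grabs the contents of the innermost parentheses:

import re
# The greedy .* backtracks to the last "(", so the group holds the innermost argument list.
print(re.match(r".*\(([^\)]+)\)", "ltrim(to_char(field_name, format))").group(1))
# prints: field_name, format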

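A variant that also captures the name of the innermost function (the \b keeps the greedy .* from swallowing most of the name):

.*\b(\w+)\(([^\)]+)\)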
+1

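One trick is to use 'eval' with a fake globals dictionary, so that every name resolves to a stub function that returns its first argument (only for trusted input, since eval executes the text as code).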

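class FakeFunction(object):
    """Stand-in for any name: calling it returns its first argument."""
    def __init__(self, name):
        self.name = name
    def __call__(self, *args):
        return args[0]
    def __str__(self):
        return self.name

class FakeGlobals(dict):
    """Namespace that resolves every name lookup to a FakeFunction."""
    def __getitem__(self, x):
        return FakeFunction(x)

def ExtractBaseFieldName(x):
    # The result is a FakeFunction; printing it shows the field name.
    return eval(x, FakeGlobals())

print(ExtractBaseFieldName('ltrim(rtrim(to_char(base_field_name, format)))'))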
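If executing the input is a concern, a similar result can be sketched without eval by parsing the text with the standard ast module and walking down the first argument of each call. This is an alternative technique, not the approach above; the function name base_field_name_ast is hypothetical, and the sketch assumes the field expression happens to be valid Python syntax, so a trailing alias such as renamed_field_name must be stripped first.

```python
import ast

def base_field_name_ast(expr):
    """Follow the first argument of each nested call down to a plain name."""
    node = ast.parse(expr.strip(), mode="eval").body
    while isinstance(node, ast.Call):
        if not node.args:
            raise ValueError("call without arguments: %r" % expr)
        node = node.args[0]          # descend into the first argument
    if not isinstance(node, ast.Name):
        raise ValueError("innermost first argument is not a plain name: %r" % expr)
    return node.id

print(base_field_name_ast("ltrim(rtrim(to_char(base_field_name, format)))"))
print(base_field_name_ast("base_field_name"))
```

Both calls print base_field_name. Nothing is executed here: the text is only parsed into a syntax tree, so unknown function names are harmless.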
+1

Do you really need a regex? If not, plain string methods can handle the nested case:

  s[s.rfind('(')+1:s.find(')')].split(',')[0]

Here 's' is the string containing the field expression.
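One caveat with the slicing one-liner: for a bare field such as "base_field_name", s.find(')') returns -1, so the slice becomes s[0:-1] and the last character is dropped. A minimal sketch that guards against this is shown here; the function name base_field_name_slice is hypothetical, and it assumes the field is a single chain of nested calls rather than several calls side by side.

```python
def base_field_name_slice(s):
    """Return the first argument of the innermost call, or the bare field name."""
    start = s.rfind("(") + 1        # 0 when there is no "(" at all
    end = s.find(")", start)        # first ")" after the innermost "("
    if end == -1:
        # No parentheses: a bare field, possibly followed by an alias.
        parts = s.split()
        return parts[0] if parts else ""
    return s[start:end].split(",")[0].strip()

print(base_field_name_slice("ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name"))
print(base_field_name_slice("base_field_name"))
```

Both calls print base_field_name.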

0

Source: https://habr.com/ru/post/1730806/

