Python 3 regular expression for $ but not $$ in string

I need to match one of the following on a line:

${aa:bb[99]}
${aa:bb}
${aa}

but not:

$${aa:bb[99]}
$${aa:bb}
$${aa}

My Python 3 regex:

pattern = r"[^\$|/^]\$\{(?P<section>[a-zA-Z]+?\:)?(?P<key>[a-zA-Z]+?)(?P<value>\[[0-9]+\])?\}"

What I'm looking for is the correct way to say "no $, or the beginning of a line". The block r"[^\$|/^]" handles every case correctly, except when the token is the very first thing on the line.
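The behaviour described above can be reproduced with a minimal sketch. Inside a character class, `|` and a non-leading `^` are literal characters, so `[^\$|/^]` means "one character that is not `$`, `|`, `/` or `^`", and it always has to consume a character. The sample strings are made up for illustration.

```python
import re

# The pattern from the question, unchanged.
pattern = re.compile(
    r"[^\$|/^]\$\{(?P<section>[a-zA-Z]+?\:)?(?P<key>[a-zA-Z]+?)(?P<value>\[[0-9]+\])?\}"
)

# A token in the middle of a line is found, but the match also
# includes the character in front of the dollar sign.
mid_line = pattern.search("x = ${aa:bb[99]}")
print(repr(mid_line.group(0)))   # ' ${aa:bb[99]}' (note the leading space)

# A token at the very start of the line is missed: the character class
# needs one character to consume and there is none.
at_start = pattern.search("${aa}")
print(at_start)                  # None

# An escaped token is rejected, as intended.
escaped = pattern.search("x = $${aa}")
print(escaped)                   # None
```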

I'm not sure which of these, if any, is right:

r"[^\$|\b]...
r"[^\$|\B]...
r"[^\$]...
r"[^\$|^]...

Any suggestions?

+4
3 answers

Use a negative lookbehind:

(?<!\$)

and then follow it with what you actually want to match. This ensures that the thing you want to match is not preceded by a $ (i.e. is not preceded by a match for \$):

(?<!\$)\$\{(?P<section>[a-zA-Z]+?\:)?(?P<key>[a-zA-Z]+?)(?P<value>\[[0-9]+\])?\}
     ^  ^
     |  |
     |  +--- The dollar sign you actually want to match
     |
     +--- The possible second preceding dollar sign you want to exclude

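(?<!...)

Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Like positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns that start with a negative lookbehind assertion may match at the beginning of the string being searched.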

https://docs.python.org/3/library/re.html
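As a quick check, here is a minimal sketch that runs the lookbehind pattern against the six sample lines from the question. The variable names are only for illustration.

```python
import re

# Pattern from the answer: negative lookbehind, then the real token.
token = re.compile(
    r"(?<!\$)\$\{(?P<section>[a-zA-Z]+?\:)?(?P<key>[a-zA-Z]+?)(?P<value>\[[0-9]+\])?\}"
)

should_match = ["${aa:bb[99]}", "${aa:bb}", "${aa}"]
should_not_match = ["$${aa:bb[99]}", "$${aa:bb}", "$${aa}"]

for line in should_match:
    m = token.search(line)
    # The lookbehind is zero-width, so the match starts at the '$'
    # itself, even at position 0 of the line.
    print(line, "->", m.start(), m.groupdict())

for line in should_not_match:
    print(line, "->", token.search(line))   # None each time
```

Because the lookbehind consumes nothing, `m.group(0)` is exactly the token, with no extra leading character to strip.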

+6

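You can use a negative lookbehind, (?<!\$), which means "not preceded by $":

(?<!\$)\${[^}]*}

The lookbehind makes the match fail whenever the character immediately before the current position is "$".

You can test it on regex101.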
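A minimal sketch of this shorter pattern on a single line that mixes real and escaped tokens. The sample text is invented for illustration. Note that `[^}]*` accepts anything between the braces, so it is looser than the pattern in the question.

```python
import re

# Shorter pattern from the answer: anything between the braces is accepted.
simple = re.compile(r"(?<!\$)\${[^}]*}")

text = "path=${dirs:home}/bin cost=$${price} idx=${items[2]}"
found = simple.findall(text)
print(found)   # ['${dirs:home}', '${items[2]}']
```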

+2

Thanks for the suggestions. My tests are at https://regex101.com/r/G2n0cO/1/. In the end, I went with:

(?:^|[^\$])\${(?:(?P<section>[a-zA-Z0-9\-_]+?)\:)??(?P<key>[a-zA-Z0-9\-_]+?)(?:\[(?P<index>[0-9]+?)\])??\}

I still had to add a check to skip the leading non-$ character that this pattern also captures; it is at the end of the example below. For the record, I have kept a few of the iterations I went through since posting this question:

    import re

    # value, level, MAX_LEVEL and log come from the surrounding code (not shown)

    # keep tokens ${[section:][key][\[index\]]} and skip false ones
    # earlier iterations, kept for the record:
    # pattern = r"\$\{((?P<section>.+?)\:)?(?P<key>.+?)(\[(?P<index>\d+?)\])+?\}"
    # pattern = r'\$\{((?P<section>\S+?)\:)??(?P<key>\S+?)(\[(?P<index>\d+?)\])?\}'
    # pattern = r'\$\{((?P<section>[a-zA-Z0-9\-_]+?)\:)??(?P<key>[a-zA-Z0-9\-_]+?)(\[(?P<index>[0-9]+?)\])??\}'
    pattern = r'(?:^|[^\$])\${(?:(?P<section>[a-zA-Z0-9\-_]+?)\:)??(?P<key>[a-zA-Z0-9\-_]+?)(?:\[(?P<index>[0-9]+?)\])??\}'

    analyser = re.compile(pattern)
    mo = analyser.search(value, 0)
    log.debug(f'got match object: {mo}')
    while mo is not None:
        log.debug(f'in while loop, level={level}')

        if level > MAX_LEVEL:
            raise RecursionError(f"too many recursive calls to _substiture_text() while processing '{value}'.")
        level += 1

        start = mo.start()
        end = mo.end()
        # the regex also captures the character before the $ sign, if any;
        # skip it so that start points at the '$'
        if value[start] != '$':
            start += 1
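For comparison, here is a minimal sketch that puts the consuming prefix `(?:^|[^\$])` side by side with the zero-width lookbehind `(?<!\$)` suggested in the other answers. The helper names `tokens_consuming` and `tokens_lookbehind` and the sample strings are hypothetical, introduced only for this illustration.

```python
import re

# Token body shared by both variants (copied from the pattern above).
BODY = (r'\${(?:(?P<section>[a-zA-Z0-9\-_]+?)\:)??'
        r'(?P<key>[a-zA-Z0-9\-_]+?)(?:\[(?P<index>[0-9]+?)\])??\}')

consuming = re.compile(r'(?:^|[^\$])' + BODY)   # prefix consumes a character
lookbehind = re.compile(r'(?<!\$)' + BODY)      # prefix is zero-width


def tokens_consuming(value):
    """Return matched tokens, trimming the extra leading character."""
    found = []
    for mo in consuming.finditer(value):
        start = mo.start()
        if value[start] != '$':
            start += 1
        found.append(value[start:mo.end()])
    return found


def tokens_lookbehind(value):
    """Return matched tokens; no trimming is needed."""
    return [mo.group(0) for mo in lookbehind.finditer(value)]


value = "a=${sec:key[3]} b=$${skip} c=${other}"
print(tokens_consuming(value))    # ['${sec:key[3]}', '${other}']
print(tokens_lookbehind(value))   # ['${sec:key[3]}', '${other}']

# Back-to-back tokens: the consuming form needs a non-'$' character before
# the second token, but the '}' it would use already belongs to the first match.
print(tokens_consuming("${a}${b}"))    # ['${a}']
print(tokens_lookbehind("${a}${b}"))   # ['${a}', '${b}']
```

With the lookbehind variant, `mo.start()` already points at the `$`, so the offset adjustment becomes unnecessary and adjacent tokens are not skipped.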
0

Source: https://habr.com/ru/post/1691521/