Parameterized Regular Expression in Python

In Python, is there a better way to parameterize strings in regular expressions than doing it manually, like this?

import re

test = 'flobalob'
names = ['a', 'b', 'c']
for name in names:
    regexp = "%s" % (name)
    print(regexp, re.search(regexp, test))

This noddy example tries to match each name in turn. I know there are better ways to do this particular job, but it is a simple example to illustrate the point.


The answer seems to be no: there is no real alternative. The best way to parameterize regular expressions in Python is as above, or with variants such as str.format(). I tried to write a general question, not a "fix my code, thank you" one. For those who are interested, here is an example that is closer to my needs:

import os
import re

filenames = ['bob.txt', 'fred.txt', 'paul.txt']
for diskfilename in os.listdir('.'):
    for filename in filenames:
        name, ext = filename.split('.')
        # e.g. 'bob.txt' -> r'bob.*\.txt'
        regexp = r"%s.*\.%s" % (name, ext)
        m = re.search(regexp, diskfilename)
        if m:
            print(diskfilename, regexp, m)
            # ...

The files on disk are named <filename>_<date>.<extension>. In the real program, filenames is a dict.

Two things complicate the matching:

  • The date. It varies from file to file.

  • The extension. It can differ too, for example .bak copies.


You could parameterize the pattern string itself with a dictionary:

d = {'bar': 'a', 'foo': 'b'}
regexp = '%(foo)s|%(bar)s' % d

Or, if the alternatives are in a list, join them:

vlist = ['a', 'b', 'c']
regexp = '|'.join(vlist)
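
If the values in the list may contain regex metacharacters, a variation on the join is to escape each one first. This is only a minimal sketch: the sample values and the name `pattern` are made up for illustration.

```python
import re

vlist = ['a.b', 'c+d', 'e']  # hypothetical values containing metacharacters

# Escape each value so that '.' and '+' are matched literally,
# then join the alternatives with '|'.
regexp = '|'.join(re.escape(s) for s in vlist)
pattern = re.compile(regexp)

print(bool(pattern.fullmatch('a.b')))  # True
print(bool(pattern.fullmatch('axb')))  # False: the '.' is literal
```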

EDIT: for the filename case, you could keep one pattern per file type:

import re

filename = 'bob_20090216.txt'

# Raw strings, with the dot escaped so it only matches a literal '.'
regexps = {'bob': r'bob_[0-9]+\.txt',
           'fred': r'fred_[0-9]+\.txt',
           'paul': r'paul_[0-9]+\.txt'}

for filetype, regexp in regexps.items():
    m = re.match(regexp, filename)
    if m is not None:
        print('%s is of type %s' % (filename, filetype))
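
The three patterns differ only in the name, so the dict could be generated rather than typed out. A sketch, assuming the date part is always a run of digits; the helper name `make_regexps` is hypothetical and not part of the answer.

```python
import re

def make_regexps(names, ext='txt'):
    """Map each name to a compiled pattern for <name>_<digits>.<ext>."""
    return {
        name: re.compile(r'%s_[0-9]+\.%s$' % (re.escape(name), re.escape(ext)))
        for name in names
    }

regexps = make_regexps(['bob', 'fred', 'paul'])
filename = 'bob_20090216.txt'

# Collect every file type whose pattern matches the filename.
matches = [filetype for filetype, rx in regexps.items() if rx.match(filename)]
print(matches)  # ['bob']
```

The trailing `$` keeps names such as bob_20090216.txt.bak from matching; drop it if those should count as well.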
import fnmatch, os

filenames = ['bob.txt', 'fred.txt', 'paul.txt']

# 'b.txt.b' -> 'b.txt*.b'
filepatterns = ((f, '*'.join(os.path.splitext(f))) for f in filenames)
# A list rather than a lazy filter(), because it is scanned once per pattern
diskfilenames = [f for f in os.listdir('.') if os.path.isfile(f)]
pattern2filenames = dict((fn, fnmatch.filter(diskfilenames, pat))
                         for fn, pat in filepatterns)

print(pattern2filenames)

Output:

{'bob.txt': ['bob20090217.txt'], 'paul.txt': [], 'fred.txt': []}
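
For anyone who would rather stay with re, fnmatch.translate() converts a shell-style pattern into a regular expression string. A small sketch using the 'bob*.txt' pattern that the code above derives from 'bob.txt'; the exact text of the translated pattern differs between Python versions, so it is not shown here.

```python
import fnmatch
import re

# Turn the shell-style pattern into a regex string and compile it.
regexp = fnmatch.translate('bob*.txt')
rx = re.compile(regexp)

print(bool(rx.match('bob20090217.txt')))      # True
print(bool(rx.match('bob20090217.txt.bak')))  # False
print(bool(rx.match('fred20090217.txt')))     # False
```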

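Some general notes:


If a plain string test such as filename.startswith(prefix) is enough, use it instead of a regular expression.

If you do need a regular expression:


  • Use re.escape(name) when name should be matched literally.

  • For the substitution itself, there are alternatives to the % operator. string.Template: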

    import string
    print(string.Template("$a $b").substitute(a=1, b="B"))
    # 1 B
    

    Or str.format() in Python 2.6+:

    print("{0.imag}".format(1j+2))
    # 1.0
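
Putting the two points together, here is a sketch of a parameterized pattern built with str.format() and re.escape(). The sample name is made up and deliberately contains regex metacharacters, and the eight-digit date is an assumption based on the 20090216 example earlier in the thread.

```python
import re

name = 'bob(1)'  # hypothetical name containing metacharacters
ext = 'txt'

# Doubled braces are literal braces for str.format(), so {{8}} becomes {8}.
regexp = r'{0}_[0-9]{{8}}\.{1}$'.format(re.escape(name), re.escape(ext))
print(regexp)

print(bool(re.match(regexp, 'bob(1)_20090216.txt')))  # True
print(bool(re.match(regexp, 'bob1_20090216.txt')))    # False
```

The doubled braces are the main cost of str.format() here; with the % operator the quantifier could be written as a plain {8}.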
    

Maybe glob and fnmatch can help you?
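
A rough sketch of what the glob route might look like. It creates a few empty files in a temporary directory so that it can run anywhere; the file names are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create a few empty sample files (hypothetical names).
    for fname in ['bob_20090216.txt', 'bob_20090216.txt.bak', 'fred_20090216.txt']:
        open(os.path.join(tmpdir, fname), 'w').close()

    # '*' stands for the unknown date part; glob.escape() protects the
    # directory name in case it contains glob metacharacters.
    wildcard = os.path.join(glob.escape(tmpdir), 'bob_*.txt')
    found = sorted(os.path.basename(p) for p in glob.glob(wildcard))

print(found)  # ['bob_20090216.txt']
```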


Source: https://habr.com/ru/post/1703314/

