I am working on a grep-like utility in Python to search for Oracle source files. Encoding standards have changed over time, so trying to find something like “all deletions from the a.foo table” may span multiple lines or not, depending on the age of this piece of code:
s = """-- multiline DDL statement
DELETE
a.foo f
WHERE
f.bar = 'XYZ';
DELETE a.foo f
WHERE f.bar = 'ABC';
DELETE a.foo WHERE bar = 'PDQ';
"""
import re
p = re.compile( r'\bDELETE\b.+?a\.foo', re.MULTILINE | re.DOTALL )
for m in re.finditer( p, s ):
print s[ m.start() : m.end() ]
It is output:
DELETE
a.foo
DELETE a.foo
DELETE a.foo
What I want:
[2] DELETE
[3] a.foo
[7] DELETE a.foo
[10] DELETE a.foo
Is there a quick / easy / built-in way to match string indexes with line numbers?
source
share