How can I parse a .sql file from Python?

Is there a way to execute some SQL commands in a .sql file from Python, but not all SQL commands in a file? Suppose I have the following .sql file:

    DROP TABLE IF EXISTS `tableA`;

    CREATE TABLE `tableA`(
    some_code
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    DROP TABLE IF EXISTS `tableB`;

    CREATE TABLE `tableB`(
    some_code
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    DROP TABLE IF EXISTS `tableC`;

    CREATE TABLE `tableC`(
    some_code
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    ...to be continued...

In this file I want to parse and run only the commands related to tableB (i.e. drop and create tableB); I do not want to execute any SQL commands on the other tables from Python. I know how to execute a whole .sql file from Python, but I don't know how to execute only some specific commands from it, as in the example above. The first thing that comes to mind is a regular expression. But after some struggle I was unable to come up with an expression that extracts what I expect, because of my limited knowledge of and experience with regular expressions.

So my question is:

1) Is it right to use a regular expression to get only the desired commands, and if so, can you show me the correct syntax for parsing it?

2) If the regular expression is not the best here, then what is the alternative solution?

3) I found several online tools for testing regular expressions, but all of them require you to supply both the expression and the test strings, and then highlight the matching data in the strings. It would be great if there were a tool that works the other way round: you first enter the test strings, then manually select the data you want inside them, and the tool returns a suitable expression. If you know of such a tool (it does not have to be online; a Macintosh application would also be welcome), please tell me.

Thanks.

+4
3 answers

While regex may not be the right tool for this, you can still use it.

    >>> import re
    >>> statements = """
    ... DROP TABLE IF EXISTS `tableA`;
    ...
    ... CREATE TABLE `tableA`(
    ... some_code
    ... ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
    ...
    ... DROP TABLE IF EXISTS `tableB`;
    ...
    ... CREATE TABLE `tableB`(
    ... some_code
    ... ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
    ...
    ... DROP TABLE IF EXISTS `tableC`;
    ...
    ... CREATE TABLE `tableC`(
    ... some_code
    ... ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
    ... """
    >>> regex = r"((?:CREATE|DROP) TABLE (?:IF (?:NOT )?EXISTS )?`tableB`(?:'[^']*'|[^;'])*;)"
    >>> re.findall(regex, statements, re.I)
    ['DROP TABLE IF EXISTS `tableB`;', 'CREATE TABLE `tableB`(\nsome_code\n) ENGINE=MyISAM DEFAULT CHARSET=latin1;']
    >>>

If you are wondering what

    (?:'[^']*'|[^;'])*

is for: it matches, any number of times (including none), either

a string literal, which allows a ; to appear inside a string, for example 'this is a ;value; for a varchar field',

or

any single character except ; and ' . The quote has to be excluded from the character class and the literal alternative tried first; otherwise the match would simply stop at the first ; it meets, even inside a string.
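The same idea can be wrapped in a small helper so the table name is not hard-coded. This is only a sketch: the function name `statements_for_table` and the sample script are made up for illustration, it assumes table names are always backtick-quoted as in the question, and it only understands single-quoted literals without escaped quotes.

```python
import re

def statements_for_table(sql_text, table_name):
    """Return the CREATE/DROP TABLE statements that target table_name."""
    pattern = (
        r"(?:CREATE|DROP)\s+TABLE\s+(?:IF\s+(?:NOT\s+)?EXISTS\s+)?"
        + "`" + re.escape(table_name) + "`"   # escape the name in case it has regex metacharacters
        + r"(?:'[^']*'|[^;'])*;"              # quoted literal, or any char except ; and '
    )
    return re.findall(pattern, sql_text, re.IGNORECASE)

# Hypothetical sample script; note the ';' inside the string literal.
sample = (
    "DROP TABLE IF EXISTS `tableA`;\n"
    "CREATE TABLE `tableA`(a INT);\n"
    "DROP TABLE IF EXISTS `tableB`;\n"
    "CREATE TABLE `tableB`(note VARCHAR(20) DEFAULT 'x;y');\n"
)
found = statements_for_table(sample, "tableB")
for statement in found:
    print(statement)
```

Each returned string can then be passed to whatever cursor or connection you already use to run SQL.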

0

You can try the sqlparse library, which makes this easier: it parses SQL statements and lets you query and work with the tokens in each statement. It could be a good base for filtering the statements that contain a specific token, such as tableB in your case.

+3

Although I personally believe you should use a parsing library that builds an AST for the SQL, looking at your code, this simpler option is also viable:

    my_sql_code = '''DROP TABLE...'''  # big long string, multiline
    statements = my_sql_code.split(';')
    statements = [s for s in statements if 'tableB' in s]
    for s in statements:
        execute_sql(s)

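To make the snippet above concrete, here is one way the undefined `execute_sql` step could be wired to a real connection. It is a sketch under assumptions: an in-memory SQLite database stands in for the MySQL server from the question (so the MySQL-specific `ENGINE=...` clauses and backticks are left out), and the function name `run_statements_for` and the sample script are hypothetical.

```python
import sqlite3

def run_statements_for(conn, sql_text, table_name):
    """Naively split on ';' and execute only the statements mentioning table_name."""
    executed = []
    for stmt in sql_text.split(';'):
        stmt = stmt.strip()
        # skip the empty chunk after the last ';' and unrelated statements
        if stmt and table_name in stmt:
            conn.execute(stmt)
            executed.append(stmt)
    conn.commit()
    return executed

script = """
DROP TABLE IF EXISTS tableA;
CREATE TABLE tableA (id INTEGER);
DROP TABLE IF EXISTS tableB;
CREATE TABLE tableB (id INTEGER);
"""
conn = sqlite3.connect(":memory:")
executed = run_statements_for(conn, script, "tableB")
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(executed)
print(tables)
```

Keep in mind the limits of this approach: a plain substring test also matches names such as tableB2, and splitting on every ; breaks statements that contain a semicolon inside a string literal.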
0

Source: https://habr.com/ru/post/1482953/

