I have a class that I use to split a SQL script on a batch separator, for example "GO", into a list of SQL commands that are then run in turn.
...
private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
    string pattern = string.Concat("^\\s*", batchIndicator, "\\s*$");
    RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
    foreach (string batch in Regex.Split(script, pattern, options))
    {
        yield return batch.Trim();
    }
}
My current implementation uses Regex with yield, but I'm not sure this is the best approach given these requirements:
- It should be fast
- It should handle large lines (for example, I have some 10 MB scripts)
- The hardest part (which does not currently work): it should take quoted text into account
Currently, the following SQL is split incorrectly:
var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");
Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");
I thought of a token-based parser that tracks whether a quote is currently open or closed, but I'm not sure whether Regex can do this.
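To make the token-based idea concrete, here is a minimal sketch of what such a scanner might look like. It is only an illustration under assumptions: `BatchSplitter` and `Split` are hypothetical names, it only tracks single-quoted strings, `--` line comments and non-nested `/* */` comments, and it ignores things like `GO 5`, bracketed or double-quoted identifiers, and nested block comments.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the original class.
public static class BatchSplitter
{
    public static IEnumerable<string> Split(string script, string batchIndicator = "GO")
    {
        var current = new StringBuilder();
        bool inString = false;        // currently inside '...'
        bool inBlockComment = false;  // currently inside /* ... */

        using (var reader = new StringReader(script))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // A separator only counts if the line begins outside any string or comment.
                if (!inString && !inBlockComment &&
                    string.Equals(line.Trim(), batchIndicator, StringComparison.OrdinalIgnoreCase))
                {
                    yield return current.ToString().Trim();
                    current.Clear();
                    continue;
                }

                // Note: line endings are normalized to Environment.NewLine here.
                current.AppendLine(line);

                // Walk the line once to update the quote/comment state.
                for (int i = 0; i < line.Length; i++)
                {
                    char c = line[i];
                    char next = i + 1 < line.Length ? line[i + 1] : '\0';

                    if (inBlockComment)
                    {
                        if (c == '*' && next == '/') { inBlockComment = false; i++; }
                    }
                    else if (inString)
                    {
                        // An escaped quote ('') closes and immediately reopens the string.
                        if (c == '\'') inString = false;
                    }
                    else if (c == '\'')
                    {
                        inString = true;
                    }
                    else if (c == '-' && next == '-')
                    {
                        break; // the rest of the line is a comment
                    }
                    else if (c == '/' && next == '*')
                    {
                        inBlockComment = true;
                        i++;
                    }
                }
            }
        }

        // Like Regex.Split, always return the trailing batch, even if it is empty.
        yield return current.ToString().Trim();
    }
}
```

Because the state is carried from line to line, a "go" that sits on its own line inside a quoted value is appended to the current batch instead of ending it. The scanner reads each character once, and the same loop could presumably take any TextReader (for example a StreamReader over the file) so that a 10 MB script does not have to be held in one string.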
Any ideas?