I used the following regex to split the fields of a text file, but during testing I ran into a strange error: a fairly simple input was split clearly incorrectly. Sample code that illustrates the behavior:
    const string line = "511525,3122,9,39,2007,9,39,3127,9,39,\" -49,368.11 \",\"-32,724.16\",2,1,\" 2,347.91 \", - ,\" 2,234.17 \", - ,2.2,1.143,2,1.24,FALSE,1,2,0,311,511625";
    const string pattern = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    Console.WriteLine();
    Console.WriteLine("SPLIT");
    var splitted = Regex.Split(line, pattern, RegexOptions.Compiled);
    foreach (var s in splitted)
    {
        Console.WriteLine(s);
    }

    Console.WriteLine();
    Console.WriteLine("REPLACE");
    var replaced = Regex.Replace(line, pattern, "!", RegexOptions.Compiled);
    Console.WriteLine(replaced);

    Console.WriteLine();
    Console.WriteLine("MATCH");
    var matches = Regex.Matches(line, pattern);
    foreach (Match match in matches)
    {
        Console.WriteLine(match.Index);
    }
So, as you can see, Split is the only method that gives unexpected results (it breaks the line at invalid positions). Both Matches and Replace give correct results. I even checked the regex in RegexBuddy, and it showed the same matches as Regex.Matches. Am I missing something, or is this a bug in the Split method?
Console output:
    SPLIT
    511525
    , - ," 2,234.17 "
    3122
    , - ," 2,234.17 "
    9
    , - ," 2,234.17 "
    39
    , - ," 2,234.17 "
    2007
    , - ," 2,234.17 "
    9
    , - ," 2,234.17 "
    39
    , - ," 2,234.17 "
    3127
    , - ," 2,234.17 "
    9
    , - ," 2,234.17 "
    39
    , - ," 2,234.17 "
    " -49,368.11 "
    , - ," 2,234.17 "
    "-32,724.16"
    , - ," 2,234.17 "
    2
    , - ," 2,234.17 "
    1
    , - ," 2,234.17 "
    " 2,347.91 "
    - ," 2,234.17 "
    -
    " 2,234.17 "
    " 2,234.17 "
    -
    2.2
    1.143
    2
    1.24
    FALSE
    1
    2
    0
    311
    511625

    REPLACE
    511525!3122!9!39!2007!9!39!3127!9!39!" -49,368.11 "!"-32,724.16"!2!1!" 2,347.91 "! - !" 2,234.17 "! - !2.2!1.143!2!1.24!FALSE!1!2!0!311!511625

    MATCH
    6
    11
    13
    16
    21
    23
    26
    31
    33
    36
    51
    64
    66
    68
    81
    87
    100
    106
    110
    116
    118
    123
    129
    131
    133
    135
    139
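A possible explanation, offered as a sketch rather than a confirmed diagnosis: the extra entries in the SPLIT output look like the last text captured by the parenthesized group inside the lookahead. Regex.Split is documented to include captured text in the returned array when the pattern contains capturing parentheses, whereas Replace and Matches only use the position of the comma. The example below compares the original capturing pattern with a non-capturing variant (`(?:...)`) and with RegexOptions.ExplicitCapture. The class name `SplitCaptureDemo`, the helper `SplitFields`, and the shortened input line are hypothetical and used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitCaptureDemo
{
    // Original pattern from the question: the group inside the lookahead captures.
    public const string Capturing = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    // Same pattern with a non-capturing group, so Split has no captures to emit.
    public const string NonCapturing = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper: splits a line and joins the pieces with '|' for display.
    public static string SplitFields(string line, string pattern,
                                     RegexOptions options = RegexOptions.None)
    {
        return string.Join("|", Regex.Split(line, pattern, options));
    }

    public static void Main()
    {
        // Shortened sample input (an assumption, not the line from the question).
        const string line = "1,\"a,b\",2";

        // Captured text of group 1 is inserted after the first field.
        Console.WriteLine(SplitFields(line, Capturing));      // 1|"a,b"|"a,b"|2

        // Non-capturing group: only the fields are returned.
        Console.WriteLine(SplitFields(line, NonCapturing));   // 1|"a,b"|2

        // ExplicitCapture makes unnamed groups non-capturing without editing the pattern.
        Console.WriteLine(SplitFields(line, Capturing, RegexOptions.ExplicitCapture)); // 1|"a,b"|2
    }
}
```

If this is indeed the cause, either variant should make Split agree with the positions reported by Matches and Replace for the original line.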