Acceleration Analysis Algorithm

Question

I am trying to parse some ddump files. Could you help me speed up my algorithm?
Each cycle takes 216 ms, which is too much. I would like to get it down to about 40-50 ms per cycle. Maybe by using RegExp?

Here is my algorithm:

  while (pos < EntireFile.Length && (/*curr = */EntireFile.Substring(pos, EntireFile.Length - pos)).Contains(" class"))
  {
      w.Reset();
      w.Start();
      pos = EntireFile.ToLower().IndexOf(" class", pos) + 6;
      int end11 = EntireFile.ToLower().IndexOf("extends", pos);
      if (end11 == -1)
          end11 = EntireFile.IndexOf("\r\n", pos);
      else
      {
          int end22 = EntireFile.IndexOf("\r\n", pos);
          if (end22 < end11)
              end11 = end22;
      }
      // string opcods = EntireFile.Substring(pos, EntireFile.Length - pos);
      string Cname = EntireFile.Substring(pos, end11 - pos).Trim();
      pos += (end11 - pos) + 7;
      pos = EntireFile.IndexOf("{", pos) + 1;
      int count = 1;
      string searching = EntireFile.Substring(pos, EntireFile.Length - pos);
      int searched = 0;
      while (count != 0)
      {
          if (searching[searched] == '{') count++;
          else if (searching[searched] == '}') count--;
          searched++;
      }
      string Content = EntireFile.Substring(pos, searched);
      tlist.Add(new TClass() { ClassName = Cname, Content = Content });
      pos += searched;
      if (pos % 3 == 0)
      {
          double prc = ((double)pos) * 100d / ((double)EntireFile.Length);
          int prcc = (int)Math.Round(prc);
          wnd.UpdateStatus(prcc);
          wnd.Update();
      }
      mils.Add((int)w.ElapsedMilliseconds);
  }

Any help would be greatly appreciated.

+4

c# algorithm parsing refactoring

alex Mar 10 '11 at 15:14

5 answers

Your performance problems come from the overhead of all the string copy operations.

There are overloads that let you specify the range a string operation should work on. You can eliminate the copying by simply using an index into the whole string to select the part that matters.

In addition, case insensitivity is not achieved by lowercasing or uppercasing the string! Use a StringComparer or the StringComparison enumeration instead. Many string overloads let you specify whether case should be considered.

Indexing a string using square brackets is also very expensive. If you look at the implementation of string operations in .NET, they always turn the search string into a char array, because it works faster. However, this means that many copies still occur even for read-only searches.
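As a minimal sketch of the overloads mentioned above: the helper below searches the whole string from a given offset and ignores case, without calling ToLower() or Substring(). The names SearchSketch and FindIgnoreCase are hypothetical and used only for illustration.

```csharp
using System;

static class SearchSketch
{
    // Hypothetical helper: finds `token` in `text` at or after `start`,
    // ignoring case. No lowered copy and no substring is created.
    public static int FindIgnoreCase(string text, string token, int start)
    {
        if (start < 0 || start > text.Length) return -1;
        return text.IndexOf(token, start, StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, SearchSketch.FindIgnoreCase(EntireFile, " class", pos) could stand in for EntireFile.ToLower().IndexOf(" class", pos), assuming an ordinal (culture-independent) comparison is acceptable for these files.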

+1

John Leidegren Mar 10 '11 at 15:24

I would recommend using a profiling tool to zero in on the part of your code that is slowing you down.

JetBrains dotTrace is one profiling product that has helped me a lot with this kind of task.

+1

code4life Mar 10 '11 at 15:25

In addition to the answer from John: as I understand it, everything in the condition of your while () is executed on every iteration. So it may help to find a way to avoid evaluating

 EntireFile.Substring(pos, EntireFile.Length - pos)).Contains(" class")

on each iteration of the while loop. Also, what exactly are you trying to parse? Is this a plain text file? You did not provide many details. One approach I like for parsing text files is to load the entire file into an array of strings, using '\n' as the delimiter. Then I can quickly walk through the array and parse the contents. If needed, I can store an array index and quickly refer back to a previous line.
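One way to keep the loop condition cheap, sketched under the assumption that only the position of the next " class" keyword is needed: a single IndexOf call per iteration acts both as the "is there another one?" test and as the locator, so no Substring(...).Contains(...) is evaluated. LoopSketch and FindClassKeywords are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class LoopSketch
{
    // Hypothetical helper: returns the index of every " class" occurrence.
    public static List<int> FindClassKeywords(string text)
    {
        const string keyword = " class";
        var positions = new List<int>();
        int pos = 0;

        // One IndexOf per iteration; the search resumes at `pos`
        // instead of re-scanning a freshly copied substring.
        while ((pos = text.IndexOf(keyword, pos, StringComparison.OrdinalIgnoreCase)) != -1)
        {
            positions.Add(pos);
            pos += keyword.Length; // continue just past this match
        }
        return positions;
    }
}
```

In the original loop, the body that extracts the class name and content would go where positions.Add(pos) is in this sketch.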

+1

Davido Mar 10 '11 at 15:29

First, you can change

 while (pos < EntireFile.Length && (/*curr = */EntireFile.Substring(pos, EntireFile.Length - pos)).Contains(" class")) { ... }

to

 var loweredEntireFile = EntireFile.ToLower();
 while (pos < loweredEntireFile.Length && Regex.IsMatch(loweredEntireFile, " class", RegexOptions.IgnoreCase))
 {
     ...
     // we just need to process the rest of the file
     loweredEntireFile = loweredEntireFile.Substring(pos, loweredEntireFile.Length - pos);
 }

then change

 pos = EntireFile.ToLower().IndexOf(" class", pos) + 6; int end11 = EntireFile.ToLower().IndexOf("extends", pos);

to

 // requires: using System.Linq; using System.Text.RegularExpressions;
 var matches = Regex.Matches(loweredEntireFile, " class", RegexOptions.IgnoreCase);
 pos = matches.Cast<Match>().First().Index;
 matches = Regex.Matches(loweredEntireFile, "extends", RegexOptions.IgnoreCase);
 var end11 = matches.Cast<Match>().First().Index;

Note that

 var loweredEntireFile = EntireFile.ToLower();

should be executed once, outside the while loop, and

 loweredEntireFile = loweredEntireFile.Substring(pos, loweredEntireFile.Length - pos);

must be done at the end of the while body.
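A variation on the regex idea, as a rough sketch only: a single precompiled Regex with RegexOptions.IgnoreCase can walk the original string via NextMatch(), so neither ToLower() nor Substring() is needed. The pattern assumes a class header looks like "class Name" with a name made of word characters, which may not hold for every dump file; RegexSketch and ClassNames are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexSketch
{
    // Assumption: a header is the word "class", whitespace, then a name
    // consisting of word characters (letters, digits, underscore).
    private static readonly Regex Header =
        new Regex(@"\bclass\s+(?<name>\w+)", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Hypothetical helper: collects every class name found in `text`.
    public static List<string> ClassNames(string text)
    {
        var names = new List<string>();
        // NextMatch() resumes after the previous match on the same string.
        for (Match m = Header.Match(text); m.Success; m = m.NextMatch())
            names.Add(m.Groups["name"].Value);
        return names;
    }
}
```

Whether this is faster than plain IndexOf calls is something to measure on real input rather than assume.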

+1

Ethan li Apr 6 '13 at 23:34

Jon · Accepted Answer · 2011-03-10T15:19:49+0000

Well, doing this several times

 EntireFile.ToLower()

certainly will not help. There are a few things you can do:

Perform expensive operations (ToLower, IndexOf, etc.) only once and cache the results where possible.
Do not keep narrowing the input with Substring; this will kill your performance. Instead, keep a separate int parseStart and pass it as an extra parameter to all your IndexOf calls. In other words, track how much of the file you have already parsed yourself, instead of taking a smaller substring each time.
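Putting both points together, a minimal sketch of the loop might look like the following. It is not the asker's exact algorithm: the TClass shape is guessed from the question, the timing and progress reporting are left out, the class name is assumed to end at "extends", a line break, or the opening brace (whichever comes first), and braces are assumed to be balanced. DumpParser and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

class TClass // shape assumed from the question
{
    public string ClassName { get; set; } = "";
    public string Content { get; set; } = "";
}

static class DumpParser
{
    private static readonly char[] Braces = { '{', '}' };

    public static List<TClass> Parse(string file)
    {
        var result = new List<TClass>();
        int parseStart = 0; // everything before this index is already parsed

        while (true)
        {
            // Case-insensitive search on the original string: no ToLower().
            int kw = file.IndexOf(" class", parseStart, StringComparison.OrdinalIgnoreCase);
            if (kw == -1) break;
            int nameStart = kw + " class".Length;

            int open = file.IndexOf('{', nameStart);
            if (open == -1) break;

            // Bounded searches: only the header between name start and '{'.
            int nameEnd = open;
            int ext = file.IndexOf("extends", nameStart, open - nameStart,
                                   StringComparison.OrdinalIgnoreCase);
            if (ext != -1) nameEnd = ext;
            int eol = file.IndexOf("\r\n", nameStart, nameEnd - nameStart,
                                   StringComparison.Ordinal);
            if (eol != -1) nameEnd = eol;
            string name = file.Substring(nameStart, nameEnd - nameStart).Trim();

            // Find the matching '}' by jumping from brace to brace.
            int depth = 1;
            int i = open + 1;
            while (depth > 0)
            {
                int next = file.IndexOfAny(Braces, i);
                if (next == -1) return result; // unbalanced input: stop
                depth += file[next] == '{' ? 1 : -1;
                i = next + 1;
            }

            // Like the question's code, Content includes the closing brace.
            // This is the only large copy made per class.
            string content = file.Substring(open + 1, i - (open + 1));
            result.Add(new TClass { ClassName = name, Content = content });
            parseStart = i;
        }
        return result;
    }
}
```

A call such as var tlist = DumpParser.Parse(EntireFile); would replace the original loop. Progress updates could be added back by reporting parseStart * 100.0 / file.Length every so often.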
