Efficient way to search string using streamreader

Question

I get a web response and use a StreamReader to read the response as a string.

My code:

    HttpWebResponse response = (HttpWebResponse) request.GetResponse();
    StreamReader reader = new StreamReader(response.GetResponseStream());
    string strResponse = reader.ReadToEnd();

Example response:

 <div class="box-round"> <ol style="list-style-type: decimal;list-style-position:outside;margin-left:42px;"> <li>Order ID #A123456 already exists: Update performed </ol> </div>

or

 <div class="box-round"> <ol style="list-style-type: decimal;list-style-position:outside;margin-left:42px;"> <li>New order created </ol> </div>

I want to find one of the following lines within the response:

 Order ID #A123456 already exists: Update performed

or

 New order created

Is this the best way to search for the line(s)?

    while (!reader.EndOfStream)
    {
        line = reader.ReadLine();
        if (!string.IsNullOrEmpty(line))
        {
        }
    }

c#

Capslock Aug 4 '11 at 23:57

3 answers

Jon Skeet · Answer 1 · 2011-08-05T00:09:55+0000

Ok, personally I would use:

    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (line.Contains(...))
        {
        }
    }

Reading a line both gives you the data and tells you whether you have reached the end of the stream, so no separate EndOfStream check is needed. I agree with Jeff, though: parsing HTML by reading it line by line is generally a bad idea. Of course, it may be good enough in your specific situation.
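As a rough illustration of the loop above, here is a minimal sketch that returns the first line containing one of several marker strings. The class name `LineSearch`, the method `FindFirstMatch`, and the marker strings are hypothetical names chosen for this example; it also assumes each message sits on a single line of the response.

```csharp
using System;
using System.IO;

public static class LineSearch
{
    // Returns the first line that contains any of the markers,
    // or null if the reader ends without a match.
    public static string FindFirstMatch(TextReader reader, params string[] markers)
    {
        string line;
        // ReadLine returns null at the end of the stream.
        while ((line = reader.ReadLine()) != null)
        {
            foreach (string marker in markers)
            {
                if (line.Contains(marker))
                {
                    return line;
                }
            }
        }
        return null;
    }
}
```

Since StreamReader derives from TextReader, the `reader` from the question could be passed in directly, for example `LineSearch.FindFirstMatch(reader, "already exists", "New order created")`.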

Benh · Answer 2 · 2011-08-05T00:47:13+0000

Here's how to do it with a regex. Regex isn't the best method for this, but if it's a one-off, working with an HTML parser is probably more than you bargained for.

    Match myMatch = Regex.Match(input, @"<div class=""box-round"">.*?<li>(.*?)</ol>", RegexOptions.Singleline);
    if (myMatch.Success)
    {
    }
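To show how the captured message might be pulled out of the match, here is a small sketch built on the same pattern. The class `OrderMessage` and the method `Extract` are hypothetical names for this example, and trimming the captured text is an assumption about what the caller wants.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrderMessage
{
    // Same pattern as in the answer. RegexOptions.Singleline makes '.'
    // match newlines too, so the markup may span several lines.
    private static readonly Regex Pattern = new Regex(
        @"<div class=""box-round"">.*?<li>(.*?)</ol>",
        RegexOptions.Singleline);

    // Returns the trimmed text between <li> and </ol>,
    // or null when the pattern does not occur in the input.
    public static string Extract(string input)
    {
        Match match = Pattern.Match(input);
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

Called on the `strResponse` string from the question, `OrderMessage.Extract(strResponse)` would yield the message text with its surrounding whitespace removed.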

Ben · Answer 3 · 2011-08-05T00:17:33+0000

It really depends. Do you need to know where in the DOM your specific text lies? How large is the input? Will your string ever be split across two lines?

If you are only concerned with whether the text is present, and your input is small enough to fit in memory, I would just read it all into memory. I'm not sure which algorithm the CLR uses for string matching, but some of the faster routines preprocess both the query and the text being searched, and having more of the text available for that preprocessing could potentially lead to a faster search.

Of course, all this depends on the internals of the CLR and your specific requirements: test, test, test.

If you need more information about your text and how it relates to the surrounding document, I would suggest looking at the Html Agility Pack to parse your document.
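For the "read it all into memory" approach, a minimal sketch might look like the following. The class `ResponseCheck`, the enum `OrderResult`, and the exact marker strings are assumptions made for this example. It uses `IndexOf` with `StringComparison.Ordinal` so the comparison is a plain character match rather than a culture-sensitive one.

```csharp
using System;
using System.IO;

public static class ResponseCheck
{
    public enum OrderResult { Unknown, Created, Updated }

    // Reads the whole response into memory and checks which message it contains.
    public static OrderResult Classify(TextReader reader)
    {
        string body = reader.ReadToEnd();

        if (body.IndexOf("New order created", StringComparison.Ordinal) >= 0)
        {
            return OrderResult.Created;
        }
        if (body.IndexOf("already exists: Update performed", StringComparison.Ordinal) >= 0)
        {
            return OrderResult.Updated;
        }
        return OrderResult.Unknown;
    }
}
```

Whether this is actually faster than scanning line by line is something to measure on real responses.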
