In what context are you doing this?
    string example_text = "<em>Ich bin ein Bärliner</em>";
    Regex em = new Regex(@"<em>[^<]*</em>");
    Match emMatch = em.Match(example_text);
    while (emMatch.Success)
    {
        Console.WriteLine(emMatch.Value);
        emMatch = emMatch.NextMatch();
    }
This displays <em>Ich bin ein Bärliner</em> in my console.
The problem is probably not that the regex returns the wrong value, but that the value is being displayed incorrectly. That can depend on many things, such as the encoding or font the output window uses. Try writing the value to a text file using UTF-8 encoding and see whether it comes out right there.
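A minimal sketch of that check, in case it helps. The class name Utf8Check, the helper RoundTrip and the file name match_output.txt are made up for illustration; only File.WriteAllText, File.ReadAllText and Encoding.UTF8 are real framework calls.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf8Check
{
    // Hypothetical helper: writes the text as UTF-8, then reads it back
    // so the two strings can be compared.
    public static string RoundTrip(string text, string path)
    {
        File.WriteAllText(path, text, Encoding.UTF8);
        return File.ReadAllText(path, Encoding.UTF8);
    }

    static void Main()
    {
        // Hypothetical file name in the temp folder.
        string path = Path.Combine(Path.GetTempPath(), "match_output.txt");
        string matched = "<em>Ich bin ein Bärliner</em>";

        string readBack = RoundTrip(matched, path);

        // If these are equal, the string itself is fine and only the display is off.
        Console.WriteLine(readBack == matched);
        Console.WriteLine("Written to " + path);
    }
}
```

Open the file in an editor that understands UTF-8. If the ä looks right there, the match was correct all along and only the console display was wrong.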
Edit: Right. The problem is that you are getting the text from a WinForms RichTextBox using the .Rtf property. That property does not return the text as is; it returns an RTF representation of the text. RTF is not plain text; it is a markup format for rich text. If you open an RTF document in, for example, Notepad, you will see that it contains a lot of strange codes, including \'e4 for each 'ä' in the document. If you used any formatting (e.g. bold text, color, etc.) in the RichTextBox, the .Rtf property will also return that markup, looking something like {\rtlch\fcs1 \af31507 \ltrch\fcs0 \cf6\insrsid15946317\charrsid15946317 test}
Therefore, use the .Text property. It will return the actual text.
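To see the difference side by side, here is a rough sketch. It assumes a Windows project that references System.Windows.Forms; the class RtfVersusText and its two helper methods are invented for the example. The exact RTF markup you get depends on the version of the control, so take the comments as a description rather than literal output.

```csharp
using System;
using System.Windows.Forms;

static class RtfVersusText
{
    // Hypothetical helper: returns the RTF markup the control produces for the content.
    public static string RtfMarkup(string content)
    {
        using (var box = new RichTextBox())
        {
            box.Text = content;
            return box.Rtf;
        }
    }

    // Hypothetical helper: returns the plain text the control holds for the content.
    public static string PlainText(string content)
    {
        using (var box = new RichTextBox())
        {
            box.Text = content;
            return box.Text;
        }
    }

    [STAThread]
    static void Main()
    {
        string content = "<em>Ich bin ein Bärliner</em>";

        // RTF markup: begins with {\rtf and contains control words and escapes.
        Console.WriteLine(RtfMarkup(content));

        // Plain text: this is the string to run the regex against.
        Console.WriteLine(PlainText(content));
    }
}
```

Running the regex against the second string rather than the first should give you the match with the ä intact.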