You could keep tweaking the regular expression with something like "\s|\w\x(..)" to handle the \x65 case. Obviously, that will be fragile, since there is no guarantee that your \x65 sequence always has a space or a word character in front of it; it could sit at the very beginning of the file. Also, your regex will match \xTT, which is obviously not Unicode. Consider replacing the '.' with a character class, like "\x([0-9a-f]{2})".
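To make the difference between '.' and a hex character class concrete, here is a minimal sketch. The class name `HexEscapePatterns` and its members are made up for illustration, and the patterns are written as verbatim strings with a doubled backslash so that the regex matches a literal backslash followed by `x`. It also accepts uppercase hex digits, which is an assumption on my part.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HexEscapePatterns
{
    // Loose pattern: any two characters after a literal "\x".
    public static readonly Regex Loose = new Regex(@"\\x(..)");

    // Strict pattern: exactly two hex digits after a literal "\x".
    public static readonly Regex Strict = new Regex(@"\\x([0-9a-fA-F]{2})");

    // Replaces each strictly valid \xHH escape with the character it encodes.
    public static string Decode(string input)
    {
        return Strict.Replace(input, m =>
            ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```

With this, `HexEscapePatterns.Loose` matches `\xTT` while `HexEscapePatterns.Strict` does not, and `Decode` leaves `\xTT` untouched. Note that it still does nothing about an escaped backslash in front of the `x`, which is the harder part of the problem.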
If this were a school project, I would do something like the following. You could replace every "\\" combination with another, unlikely sequence such as "@!!@!!@", run the regex and replacements, and then replace every occurrence of the unlikely sequence with "\". For instance:
    String s = inputString.Replace(@"\\", @"_@!!@!!@_");
    // do all of the regex, replacements, etc. here
    String output = s.Replace(@"_@!!@!!@_", @"\");
However, you should not do this in production code: if your input ever contains the magic sequence itself, you will end up with extra backslashes in the output.
Obviously, you are writing something like an interpreter. I feel obligated to recommend learning something more robust, such as lexers, which use regular expressions to build finite-state machines. Wikipedia has great articles on the topic, and I am a big fan of ANTLR. It may be overkill right now, but if you keep running into these special cases, consider solving your problem in a more general way.
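To give a rough idea of what "more general" could look like, here is a small hand-written scanner. It is only a sketch under assumed rules, not a real lexer and not what ANTLR would generate: I am assuming that `\\` stands for a single backslash, that `\xHH` (two hex digits) stands for the character with that code, and that everything else passes through unchanged. The names `EscapeScanner`, `Unescape` and `IsHex` are made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical single-pass scanner, assuming only "\\" and "\xHH" are escapes.
public static class EscapeScanner
{
    private static bool IsHex(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Walks the input once, left to right, so an escaped backslash is
    // consumed as a unit before "\xHH" is ever considered.
    public static string Unescape(string input)
    {
        var output = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c == '\\' && i + 1 < input.Length)
            {
                char next = input[i + 1];
                if (next == '\\')
                {
                    // Escaped backslash: emit one backslash, skip both characters.
                    output.Append('\\');
                    i += 2;
                    continue;
                }
                if (next == 'x' && i + 3 < input.Length
                    && IsHex(input[i + 2]) && IsHex(input[i + 3]))
                {
                    // \xHH: emit the character with that hex code.
                    output.Append((char)Convert.ToInt32(input.Substring(i + 2, 2), 16));
                    i += 4;
                    continue;
                }
            }
            // Anything else, including a malformed escape, is copied as-is.
            output.Append(c);
            i++;
        }
        return output.ToString();
    }
}
```

Because the input is consumed in order, there is no need for a magic placeholder: `EscapeScanner.Unescape(@"\\x65")` leaves the text `\x65` alone after collapsing the doubled backslash, while `EscapeScanner.Unescape(@"\x65")` decodes the escape.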
Start reading here for theory: http://en.wikipedia.org/wiki/Lexical_analysis
user1251108