I have a .NET plugin that needs to get the text of the current buffer. I found this page that shows a way to do this:
    public static string GetDocumentText(IntPtr curScintilla)
    {
        int length = (int)Win32.SendMessage(curScintilla, SciMsg.SCI_GETLENGTH, 0, 0) + 1;
        StringBuilder sb = new StringBuilder(length);
        Win32.SendMessage(curScintilla, SciMsg.SCI_GETTEXT, length, sb);
        return sb.ToString();
    }
That works fine until character encoding comes into play. I have a buffer whose "Encoding" menu setting is "UTF-8 without BOM", and I write its text to a file:
System.IO.File.WriteAllText(@"C:\Users\davet\BBBBBB.txt", sb.ToString());
When I open this file in Notepad++, the encoding menu shows "UTF-8 without BOM", but the ß character is broken (it appears as ÃŸ).
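The symptom is consistent with UTF-8 bytes being decoded by a single-byte encoding somewhere along the way and then written out again as UTF-8. Below is a minimal sketch of that effect, run outside Notepad++ entirely. It is only a guess at the mechanism: `MojibakeDemo` is a hypothetical name, and ISO-8859-1 stands in for whatever ANSI code page is actually active (under Windows-1252 the second character would be Ÿ rather than a control character).

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Decode the UTF-8 bytes of `text` with a single-byte encoding.
    // Assumption: ISO-8859-1 stands in for the real ANSI code page.
    public static string MisdecodeAsSingleByte(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Encoding singleByte = Encoding.GetEncoding("ISO-8859-1");
        return singleByte.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string original = "\u00DF"; // the ß character
        string broken = MisdecodeAsSingleByte(original);

        // One character goes in, two come out.
        Console.WriteLine(original.Length + " -> " + broken.Length);

        // Saving `broken` as UTF-8 stores four bytes instead of the original two.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(broken)));
    }
}
```

If the file written by the plugin contains a four-byte sequence where the buffer holds the two bytes C3 9F, that would point at the string marshaling rather than at `File.WriteAllText`.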
I managed to find the encoding information of my current buffer:
    int currentBuffer = (int)Win32.SendMessage(PluginBase.nppData._nppHandle, NppMsg.NPPM_GETCURRENTBUFFERID, 0, 0);
    Console.WriteLine("currentBuffer: " + currentBuffer);
    int encoding = (int)Win32.SendMessage(PluginBase.nppData._nppHandle, NppMsg.NPPM_GETBUFFERENCODING, currentBuffer, 0);
    Console.WriteLine("encoding = " + encoding);
That prints 4 for "UTF-8 without BOM" and 0 for "ASCII", but I cannot find what Notepad++ or Scintilla thinks these values are supposed to represent.
So I'm a bit lost as to where to go next (Windows is not my natural habitat). Does anyone know what I'm getting wrong, or how to debug this further?
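One experiment that might narrow this down is to fetch the text into a `byte[]` and decode it explicitly, so that no string marshaling is involved. The sketch below is untested against a real Notepad++ instance and rests on several assumptions. The `RawScintillaText` class and its P/Invoke declaration are hypothetical and not part of the plugin template. The numeric message values are copied from Scintilla.h. A buffer that is not UTF-8 is decoded with `Encoding.Default`, which is the ANSI code page only on .NET Framework.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class RawScintillaText
{
    // Message and code page constants as defined in Scintilla.h.
    private const uint SCI_GETLENGTH = 2006;
    private const uint SCI_GETCODEPAGE = 2137;
    private const uint SCI_GETTEXT = 2182;
    private const int SC_CP_UTF8 = 65001;

    // Hypothetical declaration: the byte[] is passed as-is, with no string conversion.
    [DllImport("user32.dll", EntryPoint = "SendMessageW")]
    private static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, [In, Out] byte[] lParam);

    // Decode the first `count` bytes according to the code page Scintilla reports.
    // Assumption: anything other than UTF-8 uses the system default encoding.
    public static string Decode(byte[] raw, int count, int codePage)
    {
        Encoding enc = codePage == SC_CP_UTF8 ? Encoding.UTF8 : Encoding.Default;
        return enc.GetString(raw, 0, count);
    }

    public static string GetDocumentText(IntPtr curScintilla)
    {
        int length = SendMessage(curScintilla, SCI_GETLENGTH, IntPtr.Zero, null).ToInt32();
        int codePage = SendMessage(curScintilla, SCI_GETCODEPAGE, IntPtr.Zero, null).ToInt32();

        byte[] buffer = new byte[length + 1]; // one extra byte for the terminating NUL
        SendMessage(curScintilla, SCI_GETTEXT, new IntPtr(buffer.Length), buffer);
        return Decode(buffer, length, codePage);
    }
}
```

Comparing the result of this `GetDocumentText` with the `StringBuilder` version for a buffer containing ß should show whether the marshaling step is where the character gets damaged.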
Thanks.