I am implementing various APIs in C and C++ and am wondering what techniques are available to prevent clients from getting the encoding wrong when they receive strings from the framework or pass strings back to it. For example, imagine a simple C++ plugin API that clients can implement to influence translations. It might have a function like this:
const char *getTranslatedWord( const char *englishWord );
Now let's say that I want to ensure that all strings are passed as UTF-8. Of course I would document this requirement, but I would also like the compiler to enforce the correct encoding, possibly by using dedicated types. For example, something like this:
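class Word {
public:
    // Named constructor: the caller has to state the encoding explicitly.
    static Word fromUtf8( const char *data ) { return Word( data ); }
    // const, so it can be called through the const Word & used in the API.
    const char *toUtf8() const { return m_data; }
private:
    Word( const char *data ) : m_data( data ) { }
    const char *m_data; // not owned; the caller keeps the buffer alive
};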
Now I can use this specialized type in the API:
Word getTranslatedWord( const Word &englishWord );
Unfortunately, it is easy to make this very inefficient. The Word class has no proper copy constructor, assignment operator, etc., and I would like to avoid unnecessary copying of data as much as possible. I also see the danger that Word will accumulate all kinds of utility functions (e.g. length, fromLatin1, substr, etc.), and I would rather not write yet another string class. I just want a small container that prevents accidentally mixing encodings.
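A minimal sketch of what such a small container could look like, assuming a non-owning pointer wrapper that is tagged with an encoding type. The names EncodedString, Utf8Tag, Latin1Tag, Utf8String and Latin1String are hypothetical, the body of getTranslatedWord is a placeholder, and none of this has been tested on the older compilers mentioned in the edit at the end of this question:

```cpp
#include <cstring>

// Hypothetical tag types: they carry no data and only name an encoding.
struct Utf8Tag {};
struct Latin1Tag {};

// Non-owning view of a NUL-terminated string, tagged with its encoding.
// Copying copies only the pointer; the caller must keep the buffer alive.
template <typename Encoding>
class EncodedString {
public:
    // explicit: a bare const char * never converts silently.
    explicit EncodedString( const char *data ) : m_data( data ) { }
    const char *data() const { return m_data; }
private:
    const char *m_data;
};

typedef EncodedString<Utf8Tag> Utf8String;
typedef EncodedString<Latin1Tag> Latin1String;

// Placeholder implementation, for illustration only.
Utf8String getTranslatedWord( const Utf8String &englishWord )
{
    if ( std::strcmp( englishWord.data(), "hello" ) == 0 )
        return Utf8String( "hallo" );
    return englishWord; // unknown words pass through unchanged
}
```

With this sketch, getTranslatedWord( Utf8String( "hello" ) ) compiles, while passing a Latin1String or a bare string literal does not, because the two typedefs are distinct types and the constructor is explicit. The wrapper deliberately has no length, substr or conversion helpers, and since it only holds a pointer, the compiler-generated copy constructor and assignment operator are cheap; the cost is that ownership of the buffer stays entirely with the caller.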
I wonder whether anyone has experience with this and can share some useful techniques.
EDIT: In my particular case, the API is used on Windows and Linux, built with MSVC 6 through MSVC 10 on Windows and with gcc 3 and 4 on Linux.