Handling multiple string types in new C++ libraries

One thing C++ has plenty of is string types, or rather character types: char, wchar_t, char16_t, char32_t. As a result, we have different string typedefs: std::string, std::wstring, std::u16string and std::u32string, each representing a different kind of string.
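For reference, a minimal sketch of how the four character types map to literal prefixes and string typedefs. The variable names are only illustrative, and the snippet assumes C++11 or later.

```cpp
#include <string>
#include <type_traits>

// Each literal prefix selects a different character type.
std::string    narrow = "key";   // char
std::wstring   wide   = L"key";  // wchar_t (width is platform-dependent)
std::u16string utf16  = u"key";  // char16_t
std::u32string utf32  = U"key";  // char32_t

// All four typedefs are instantiations of the same template.
static_assert(std::is_same<std::string,    std::basic_string<char>>::value,     "");
static_assert(std::is_same<std::wstring,   std::basic_string<wchar_t>>::value,  "");
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value, "");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value, "");
```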

And it does not stop there: if we are talking about Windows and COM, there are also platform-specific types, for example BSTR. And we have not even mentioned character encodings.

If you were creating a new library, and one of the requirements was to support all of these string or character types, how would you do it? Set character encodings aside for now.

I have thought about it and come up with several options, but none of them is perfect. Suppose you have a class registry_key that must support all these character types, and part of its object model looks more or less like this (only part of it is shown here):

class registry_key
{
public:
  registry_key(unspecified_string_type keyname);
  unspecified_string_type name() const; 
  unspecified_string_type path() const; 
};

And you would use it like:

registry_key key("HKLM\\Software\\Adobe");
std::string name = key.name();

But it must support the other string types too. In addition, nothing requires registry_key as a whole to be consistent about character types or to work with a single character type. You can call the constructor with a const char* but get the key name back as a u16string. This mirrors the platform underneath, which lets you call the wide (XxxW) and narrow (XxxA) variants within the same set of APIs. And this behavior is desirable.

( , ) , . , , , .

, :

1) Follow the model of basic_string in the STL. For example,

wregistry_key key(L"HKLM\\Software\\Adobe");
std::wstring name = key.name();

u8registry_key key(u8"HKLM\\Software\\Adobe");
std::u16string name = key.name();
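A rough sketch of what the basic_string-style option could look like: a single class template parameterised on the character type, plus one alias per type. Everything here is hypothetical, including basic_registry_key, the alias names and the way name() is derived from the path; nothing talks to the real registry.

```cpp
#include <string>
#include <utility>

// Hypothetical sketch: one template, parameterised on the character type,
// in the same way std::basic_string is.
template <typename CharT>
class basic_registry_key {
public:
    using string_type = std::basic_string<CharT>;

    explicit basic_registry_key(string_type keyname)
        : path_(std::move(keyname)) {}

    // Full path exactly as passed to the constructor.
    const string_type& path() const { return path_; }

    // Last path component: everything after the final backslash,
    // or the whole path if there is no backslash.
    string_type name() const {
        const auto pos = path_.rfind(CharT('\\'));
        return pos == string_type::npos ? path_ : path_.substr(pos + 1);
    }

private:
    string_type path_;
};

// One alias per character type, mirroring string / wstring / u16string / u32string.
using registry_key    = basic_registry_key<char>;
using wregistry_key   = basic_registry_key<wchar_t>;
using u16registry_key = basic_registry_key<char16_t>;
using u32registry_key = basic_registry_key<char32_t>;
```

Note that with this design each instantiation is tied to one character type, so a key constructed from a narrow string cannot hand back a u16string without a separate conversion step.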

, , , , -, . , , .

2) , u16string u32string. , , .

3) Provide a separate accessor for each string type:

registry_key key("HKLM\\Software\\Adobe");
std::string name = key.name();
std::wstring name = key.wname();
std::u16string name = key.u8name();
std::u32string name = key.uname();

, redundant.

4) , . , , , . , .

platform_string str = L"foo";
std::string sstr = str;
std::wstring swstr = str;
std::u16string su16str = str;
str = u"foo";

, :

class registry_key
{
public:
  registry_key(unspecified_string_type keyname);
  platform_string name() const; 
  platform_string path() const; 
};

:

registry_key key("HKLM\\Software\\Adobe");
std::string name = key.name();
std::wstring name = key.name();
std::u16string name = key.name();
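To make the platform_string idea more concrete, here is a minimal sketch of such a type. It is an assumption-laden illustration, not the original poster's implementation: it stores Unicode code points internally, treats const char* input as UTF-8, assumes well-formed input (no validation), and leaves out wchar_t because its width differs between platforms.

```cpp
#include <string>

// Hypothetical sketch: converts implicitly to several std string types.
class platform_string {
public:
    platform_string(const char* utf8)      { decode_utf8(utf8); }
    platform_string(const char16_t* utf16) { decode_utf16(utf16); }
    platform_string(const char32_t* utf32) : cps_(utf32) {}

    operator std::u32string() const { return cps_; }

    operator std::u16string() const {
        std::u16string out;
        for (char32_t cp : cps_) {
            if (cp < 0x10000) {
                out += static_cast<char16_t>(cp);
            } else {  // encode as a surrogate pair
                cp -= 0x10000;
                out += static_cast<char16_t>(0xD800 + (cp >> 10));
                out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
            }
        }
        return out;
    }

    operator std::string() const {  // UTF-8
        std::string out;
        for (char32_t cp : cps_) {
            if (cp < 0x80) {
                out += static_cast<char>(cp);
            } else if (cp < 0x800) {
                out += static_cast<char>(0xC0 | (cp >> 6));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out += static_cast<char>(0xE0 | (cp >> 12));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else {
                out += static_cast<char>(0xF0 | (cp >> 18));
                out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            }
        }
        return out;
    }

private:
    void decode_utf8(const char* s) {
        while (*s) {
            const unsigned char b = static_cast<unsigned char>(*s++);
            char32_t cp;
            int extra;  // number of continuation bytes that follow
            if (b < 0x80)      { cp = b;        extra = 0; }
            else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
            else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
            else               { cp = b & 0x07; extra = 3; }
            while (extra-- > 0 && *s)
                cp = (cp << 6) | (static_cast<unsigned char>(*s++) & 0x3F);
            cps_.push_back(cp);
        }
    }

    void decode_utf16(const char16_t* s) {
        while (*s) {
            char32_t cp = *s++;
            if (cp >= 0xD800 && cp <= 0xDBFF && *s)  // high surrogate: combine with low
                cp = 0x10000 + ((cp - 0xD800) << 10) + (*s++ - 0xDC00);
            cps_.push_back(cp);
        }
    }

    std::u32string cps_;  // one element per Unicode code point
};
```

With this sketch, copy-initialisation such as std::string s = str; picks the matching conversion operator. Direct-initialisation (std::string s(str);) can be ambiguous, which is one of the practical costs of leaning on implicit conversions.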

- - , . .

, 3) 4)? ?

+4
3

, . C, C, - ++. , .

.

.

+4

, , ?

.

std::codecvt boost::nowide, .

, , UTF-32 ( ) UTF-8. UTF-16 ( char16_t, not wchar_t) , , Windows, ( , ).

wstring wchar_t , , - , wchar_t . (, char, char16_t char32_t).

(3) API... , !

(4) , , .

, . (2) .

, http://utf8everywhere.org/ .

+4

, , boost.filesystem, V3. , , , , , .

+1

Source: https://habr.com/ru/post/1540425/

