Normalizing (WebDAV) Unicode paths

I am working on a WebDAV implementation for PHP. To make Windows and other operating systems work well together, I need to jump through some character-encoding hoops.

Windows uses ISO-8859-1 in its HTTP requests, while most other clients encode anything outside ASCII as UTF-8.

My first approach was to ignore this completely, but I quickly ran into problems when returning URLs. I then realized it is probably best to normalize all URLs.

Take ü as an example. OS X sends this over the wire as:

u%CC%88 (the letter u followed by the combining diaeresis, codepoint U+0308)

Windows sends this as:

%FC (latin1)

But running utf8_encode on %FC gives me:

%C3%BC (this is codepoint U+00FC)
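The byte sequences above can be reproduced with a minimal sketch like the following. It assumes the mbstring extension is available and uses mb_convert_encoding() in place of utf8_encode(), which is deprecated as of PHP 8.2; the variable names are only illustrative.

```php
<?php
// Decode the percent-encoded path segment as each client sends it.
$fromMac     = rawurldecode('u%CC%88'); // "u" + U+0308 (combining diaeresis), UTF-8 bytes
$fromWindows = rawurldecode('%FC');     // single byte 0xFC, ISO-8859-1

// Convert the Latin-1 byte to UTF-8 (same result utf8_encode() would give).
$windowsUtf8 = mb_convert_encoding($fromWindows, 'UTF-8', 'ISO-8859-1');

echo bin2hex($fromMac), "\n";     // 75cc88
echo bin2hex($windowsUtf8), "\n"; // c3bc

// Both strings render as the same character, but they are not byte-equal.
var_dump($fromMac === $windowsUtf8); // bool(false)
```

So even after the Latin-1 request is converted to UTF-8, a plain string comparison treats the two paths as different resources.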

Should I treat %C3%BC and u%CC%88 as the same thing? If so, how?


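The Mac uses decomposed Unicode, so it sends "u" + ¨ (diaeresis) instead of "ü". You can fix this with the Normalizer class. If you don't have Normalizer, you can use iconv('UTF8-MAC', 'UTF8', $str).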
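As a rough sketch of that suggestion, the hypothetical helper below (normalizePath is not part of any library) brings a decoded path into composed UTF-8 (NFC). It assumes the mbstring extension is loaded, uses the intl extension's Normalizer class when present, and guesses that any input that is not valid UTF-8 came from a Latin-1 client. That guess is a heuristic: some Latin-1 byte sequences are also valid UTF-8 and would be left untouched.

```php
<?php
/**
 * Hypothetical helper: convert a decoded request path to UTF-8 in NFC form.
 */
function normalizePath(string $path): string
{
    // Assumption: input that is not valid UTF-8 was sent as ISO-8859-1.
    if (!mb_check_encoding($path, 'UTF-8')) {
        $path = mb_convert_encoding($path, 'UTF-8', 'ISO-8859-1');
    }

    // Preferred route: compose "u" + U+0308 into U+00FC with ext/intl.
    if (class_exists('Normalizer')) {
        $normalized = Normalizer::normalize($path, Normalizer::FORM_C);
        if ($normalized !== false) {
            return $normalized;
        }
    }

    // Fallback from the answer above. This encoding name is typically only
    // known to the iconv library shipped with macOS, so failure is tolerated
    // and the path is returned unchanged in that case.
    $converted = @iconv('UTF8-MAC', 'UTF8', $path);
    return $converted !== false ? $converted : $path;
}

echo bin2hex(normalizePath(rawurldecode('%FC'))), "\n";     // c3bc
echo bin2hex(normalizePath(rawurldecode('u%CC%88'))), "\n"; // c3bc when Normalizer is available
```

With both clients mapped to the same byte sequence, the normalized path can be used as the single key for lookups and for the URLs returned in responses.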


Source: https://habr.com/ru/post/1738644/

