Possibly wrong XHTML from the Haskell standard library?

I am trying to troubleshoot a Haskell program that generates XHTML from UTF-8 text. The program accepts lines of this text and should produce valid XHTML, but it does not. I import Text.XHtml.Transitional and use the href and identifier functions to generate URIs and id attributes from UTF-8 strings. In the Haskell interpreter, we can see:

Prelude Text.XHtml.Transitional> href "äöü"
href="&#228;&#246;&#252;"

This is a good and valid XHTML URI. However,

Prelude Text.XHtml.Transitional> identifier "äöü"
id="&#228;&#246;&#252;"

is not, according to the specification, which does not allow the '&', '#' and ';' characters in an id. So it looks like the Text.XHtml.Transitional library is buggy. Moreover, I think that even XHTML itself is at fault, because it does not provide a 1:1 mapping from UTF-8 into attributes, let alone one that is identical to the mapping used for URIs.

Since I'm new to Haskell, I may have made a mistake somewhere. I also know that HTML5 relaxes these restrictions on attributes, but HTML5 is not predominant yet. Is the library buggy? If so, which library should replace it?

See also Invalid Xhtml Characters?

+4
1 answer

Many non-ASCII Unicode characters are valid in IDs (see the Name production), including your accented letters.
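To make the Name production concrete, here is a minimal sketch of a checker. It assumes the character ranges of the XML 1.0 (Fifth Edition) Name production; the function names isNameStartChar, isNameChar and isXmlName are made up for this illustration and are not part of Text.XHtml.

```haskell
import Data.Char (ord)

-- Hypothetical helper: is the code point inside any of the inclusive ranges?
inRanges :: [(Int, Int)] -> Char -> Bool
inRanges rs c = any (\(lo, hi) -> lo <= n && n <= hi) rs
  where n = ord c

-- NameStartChar, assuming the XML 1.0 (Fifth Edition) ranges.
isNameStartChar :: Char -> Bool
isNameStartChar c = c `elem` ":_" || inRanges
  [ (0x41, 0x5A), (0x61, 0x7A), (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF)
  , (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F)
  , (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD)
  , (0x10000, 0xEFFFF) ] c

-- NameChar: NameStartChar plus '-', '.', digits, U+00B7 and combining marks.
isNameChar :: Char -> Bool
isNameChar c = isNameStartChar c || c `elem` "-." || inRanges
  [ (0x30, 0x39), (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040) ] c

-- A Name is one NameStartChar followed by zero or more NameChars.
isXmlName :: String -> Bool
isXmlName []       = False
isXmlName (c : cs) = isNameStartChar c && all isNameChar cs
```

Under these assumptions isXmlName accepts "äöü" (ä and ö fall in U+00D8..U+00F6, ü in U+00F8..U+02FF), while a string that literally contains '&', '#' or ';' is rejected.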

Note that productions apply after normalization.

That is, &, # and ; may not appear in an ID, but in your example they do not appear in the ID: the ID is äöü. It is then encoded as &#228;&#246;&#252;, presumably to survive being output as US-ASCII or ISO-8859-1.
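The round trip can be sketched as follows. This is only an illustration of the idea, not the library's actual code: escapeNonAscii and unescapeNumeric are hypothetical names, and the sketch handles decimal numeric character references only (no named entities, no hex references, no escaping of '<', '>', '&' or '"').

```haskell
import Data.Char (chr, isDigit, ord)

-- Hypothetical helper: replace every non-ASCII character with a decimal
-- numeric character reference, leaving ASCII characters untouched.
escapeNonAscii :: String -> String
escapeNonAscii = concatMap esc
  where
    esc c
      | ord c < 0x80 = [c]
      | otherwise    = "&#" ++ show (ord c) ++ ";"

-- Hypothetical helper: decode decimal numeric character references, which
-- is roughly what an XML parser does before it looks at the attribute value.
unescapeNumeric :: String -> String
unescapeNumeric ('&' : '#' : rest)
  | (digits@(_ : _), ';' : rest') <- span isDigit rest
  , let n = read digits :: Integer
  , n <= 0x10FFFF
  = chr (fromInteger n) : unescapeNumeric rest'
-- Anything that is not a well-formed reference is copied through unchanged.
unescapeNumeric (c : cs) = c : unescapeNumeric cs
unescapeNumeric []       = []
```

With these definitions, escapeNonAscii "äöü" yields the serialized form &#228;&#246;&#252;, and unescapeNumeric turns it back into äöü, which is the value the ID rules are applied to.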

So I do not think this is a bug in the library.

+7

Source: https://habr.com/ru/post/1382020/

