I have an application with an address bar, and users enter an IRI that I have to connect to.
On Unix / Darwin, it's simple: I convert IRIs to URIs as described in RFC 3987. That is, if the scheme has an authority section, I map the host to ASCII with punycode, and then percent-encode any non-ASCII characters in the rest of the IRI.
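For illustration, the conversion described above (punycode for the host, percent-encoding for everything else) could be sketched roughly like this in Python. The function name `iri_to_uri` is hypothetical, userinfo handling is left out, and Python's built-in `idna` codec implements IDNA 2003, which may differ from what a given application needs.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters left untouched outside the host. "%" is kept so that
# escapes already present in the input are not encoded twice.
_SAFE = "/:@!$&'()*+,;=-._~%"


def iri_to_uri(iri: str) -> str:
    """Rough IRI -> URI mapping; a sketch, not a full RFC 3987 implementation."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Map each host label to ASCII (the "xn--" form) with the idna codec.
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Non-ASCII characters elsewhere are percent-encoded as UTF-8 bytes.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

Pure-ASCII input passes through unchanged, so the function can be applied to whatever the user typed.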
On Windows there are two possibilities: either the domain name is a regular Internet resource, in which case it must be mapped to ASCII with punycode and looked up through normal DNS, or the domain name belongs to an unusual Windows server (for example, an Active Directory DNS server), and the lookup should use UTF-8.
Examples
- User types http://☃.net : call getaddrinfo(service="xn--n3h.net").
- User types http://dryden.internal.corp.com : calling getaddrinfo(service="dryden.internal.corp.com") will work fine.
- User types http://pöp.internal.corp.com :
  - calling getaddrinfo(service="xn--pp-fka.internal.corp.com") does not work if "pöp" is the machine name published by the UTF-8 DNS.
  - calling GetAddrInfoW(service=L"pöp.internal.corp.com") works.
Firefox and Chrome both appear to apply punycode to any IRI, so they cannot resolve these unusual Microsoft domains.
Rules?
What guidelines exist for processing IRIs in such an environment? Are there any recommended ways to guess which kind of DNS lookup should be performed, punycode or UTF-8? What do other apps do?
My best attempt at a solution is to try punycode first if the TLD is public, but skip the punycode attempt if the TLD is internal (acme.com can serve public things, acme.ltd is probably an intranet). If the punycode lookup failed or was skipped, I try a UTF-8 lookup.
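As a rough sketch of that fallback order in Python: the names `lookup_candidates`, `resolve` and `INTERNAL_SUFFIXES` are hypothetical, and the suffix list merely stands in for whatever rule decides that a name is internal. Passing raw UTF-8 bytes to the resolver is also an assumption here; on Windows the actual call would be GetAddrInfoW with the Unicode name.

```python
import socket

# Hypothetical suffixes treated as "internal"; adjust per deployment.
INTERNAL_SUFFIXES = (".internal.corp.com", ".local")


def lookup_candidates(host, internal_suffixes=INTERNAL_SUFFIXES):
    """Return the byte encodings of `host` to try, in order."""
    if host.isascii():
        # Plain ASCII names need no guessing at all.
        return [host.encode("ascii")]
    candidates = []
    if not host.lower().endswith(tuple(internal_suffixes)):
        # Public-looking name: try the punycode ("xn--") form first.
        candidates.append(host.encode("idna"))
    # Fall back to (or start with) the raw UTF-8 form.
    candidates.append(host.encode("utf-8"))
    return candidates


def resolve(host, port, lookup=socket.getaddrinfo):
    """Try each candidate encoding until one lookup succeeds."""
    last_error = None
    for name in lookup_candidates(host):
        try:
            return lookup(name, port)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            last_error = exc
    raise last_error
```

The `lookup` parameter exists only so the resolver can be swapped out, for example with a wrapper around the wide-character Windows API or with a stub in tests.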