If you are loading the document from an external source, you can check its MIME type and see whether it is application/xhtml+xml; if so, then it is definitely XHTML (of course, the server can lie and send this type along with badly malformed markup). Otherwise, if it is text/html, it will be parsed as HTML tag soup. Validity of the actual markup aside, the doctype declaration is your best way to find out whether the content is (or claims to be) HTML or XHTML.
As you say, you can check the public identifier and/or the system identifier (the URI) in the doctype and determine the type from there.
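As a rough illustration of the two checks, here is a minimal Python sketch. The function name `classify_markup`, the regular expression, and the "xhtml" / "html" / "unknown" labels are my own assumptions for the example, not part of any library. It only looks at the Content-Type value and the doctype string; it does not validate the markup.

```python
import re

# Hypothetical helper pattern: captures the optional public identifier and
# system identifier (URI) from a doctype declaration.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?)?',
    re.IGNORECASE,
)


def classify_markup(content_type, text):
    """Guess whether a document is (or claims to be) XHTML or HTML."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = (content_type or "").split(";", 1)[0].strip().lower()
    if mime == "application/xhtml+xml":
        return "xhtml"

    # Otherwise fall back to what the doctype declaration claims.
    match = DOCTYPE_RE.search(text)
    if match is None:
        return "unknown"

    public_id = match.group(1) or ""
    system_id = match.group(2) or ""
    if "XHTML" in public_id.upper() or "xhtml" in system_id.lower():
        return "xhtml"
    return "html"
```

For example, `classify_markup("text/html", '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">')` reports what the doctype claims, even though a browser receiving text/html would still parse the page as HTML.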