It is used for Element#absUrl(), among others, so that you can retrieve the (intended) absolute URL of an <a href>, <img src>, <link href>, <script src>, etc. E.g.:
    for (Element link : document.select("a")) {
        System.out.println(link.absUrl("href"));
    }
This is very useful if you want to download and/or parse the linked resources as well.
In the second version of parse(), what does "resolve relative URLs to absolute URLs, that occur before the HTML declares a <base href> tag" mean? What should I do if a <base href> tag is never found in the page?
Some (poor) websites may declare a <link> or <script> with a relative URL before the <base> tag. And if there is no <base> at all, then the given baseUri alone is used to resolve the relative URLs of the entire document.
What is the purpose of absolute URL detection? Why does Jsoup need to find the absolute URL?
So that Element#absUrl() can return the right URL. This is purely for the convenience of the end user. Jsoup does not need it to parse the HTML successfully.
Last but most importantly: is baseUri the full URL of the HTML page (as phrased in the original documentation), or is it the base URL of the HTML page?
The former. If it were the latter, the documentation would be lying. baseUri should not be confused with <base href>.
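To see why passing the full page URL works, here is a minimal sketch of standard relative URL resolution using only java.net.URI. It is not Jsoup's own code, and Jsoup's resolution may differ in edge cases. The class name BaseUriDemo, the resolve helper, and the example.com URLs are all made up for illustration.

```java
import java.net.URI;

class BaseUriDemo {
    // Resolves a relative reference against a base URL using the JDK's
    // standard URI resolution. Hypothetical helper, not part of Jsoup.
    public static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Assumed full URL of the page, i.e. what you would pass as baseUri.
        String pageUrl = "http://example.com/docs/page.html";

        // A path-relative link replaces the last path segment (page.html).
        System.out.println(resolve(pageUrl, "img/logo.png"));

        // A root-relative link keeps only the scheme and host.
        System.out.println(resolve(pageUrl, "/css/site.css"));

        // A different base, such as one a <base href> could declare.
        System.out.println(resolve("http://cdn.example.com/assets/", "img/logo.png"));
    }
}
```

Because the last path segment of the base is dropped during resolution, the full page URL and its parent directory URL give the same result for path-relative links, so there is no need to trim the page name off yourself before handing it over as baseUri.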