HtmlUnit cannot get page after file download

I am having a strange issue with HtmlUnit in Java. I use it to download some data from a website; the process looks like this:

1 - Login

2 - For each element (car):

----- 3 - Search for the car

----- 4 - Download the zip file from the link

Code:

Web client creation:

    webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
    webClient.setJavaScriptEnabled(true);
    webClient.setThrowExceptionOnScriptError(false);

    DefaultCredentialsProvider provider = new DefaultCredentialsProvider();
    provider.addCredentials(USERNAME, PASSWORD);
    webClient.setCredentialsProvider(provider);

    webClient.setRefreshHandler(new ImmediateRefreshHandler());

Login:

    public void login() throws IOException {
        page = (HtmlPage) webClient.getPage(URL);
        HtmlForm form = page.getFormByName("formLogin");

        String user = USERNAME;
        String password = PASSWORD;

        // Enter login and password
        form.getInputByName("LoginSteps$UserName").setValueAttribute(user);
        form.getInputByName("LoginSteps$Password").setValueAttribute(password);

        // Click Login Button
        page = (HtmlPage) form.getInputByName("LoginSteps$LoginButton").click();
        webClient.waitForBackgroundJavaScript(3000);

        // Click on Campa area
        HtmlAnchor link = (HtmlAnchor) page.getElementById("ctl00_linkCampaNoiH");
        page = (HtmlPage) link.click();
        webClient.waitForBackgroundJavaScript(3000);

        System.out.println(page.asText());
    }

Search for a car on the site:

    private void searchCar(String _regNumber) throws IOException {
        // Open search window
        page = page.getElementById("search_gridCampaNoi").click();
        webClient.waitForBackgroundJavaScript(3000);

        // Write plate number
        HtmlInput element = (HtmlInput) page.getElementById("jqg1");
        element.setValueAttribute(_regNumber);
        webClient.waitForBackgroundJavaScript(3000);

        // Click on search
        HtmlAnchor anchor = (HtmlAnchor) page.getByXPath("//*[@id=\"fbox_gridCampaNoi_search\"]").get(0);
        page = anchor.click();
        webClient.waitForBackgroundJavaScript(3000);

        System.out.println(page.asText());
    }

Download the zip file with the PDFs:

    // Body of the download method (its signature was not included in the post)
        try {
            InputStream is = _link.click().getWebResponse().getContentAsStream();

            File path = new File(new File(DOWNLOAD_PATH), _regNumber);
            if (!path.exists()) {
                path.mkdir();
            }

            writeToFile(is, new File(path, _regNumber + "_pdfs.zip"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    } // end of the download method
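The post does not show `writeToFile`, so its real implementation is unknown. As a minimal sketch of what such a helper could look like using only the JDK, the hypothetical `DownloadFiles.save` below (the class and method names are assumptions, not part of the original code) copies the response stream to disk. It uses `Files.createDirectories`, which also creates missing parent directories, whereas `File.mkdir()` in the snippet above creates only the last directory and returns false if the parent is missing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; stands in for the writeToFile method the post does not show.
final class DownloadFiles {
    private DownloadFiles() {}

    /**
     * Copies the stream to dir/fileName, creating dir and any missing
     * parent directories first. An existing file is overwritten.
     * The stream is closed when the copy finishes or fails.
     */
    static Path save(InputStream in, Path dir, String fileName) throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(fileName);
        try (InputStream source = in) {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

With this sketch, the download step would pass the stream from `getContentAsStream()` together with something like `Paths.get(DOWNLOAD_PATH, _regNumber)` and `_regNumber + "_pdfs.zip"`.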

Problem:

The first car works fine and the zip file is downloaded, but as soon as I search for the next car and reach this line:

 page = page.getElementById("search_gridCampaNoi").click(); 

I get this exception:

 Exception in thread "main" java.lang.ClassCastException: com.gargoylesoftware.htmlunit.UnexpectedPage cannot be cast to com.gargoylesoftware.htmlunit.html.HtmlPage 

After debugging, I realized that the moment I make this call:

 InputStream is = _link.click().getWebResponse().getContentAsStream(); 

the return type of page.getElementById("search_gridCampaNoi").click() changes from HtmlPage to WebResponse (the exception reports an UnexpectedPage), so instead of getting a new page, I again get the file that I have already downloaded.

Debugger screenshots of this situation (images not preserved):

First call: the return type is OK.

Second call: the return type has changed, and I no longer get an HtmlPage.

Thanks in advance!

1 answer

Just in case someone is facing the same problem, I found a workaround. Replacing this line:

 InputStream is = _link.click().getWebResponse().getContentAsStream(); 

with

 InputStream is = _link.openLinkInNewWindow().getWebResponse().getContentAsStream(); 

seems to do the trick. I still have problems when running several iterations (sometimes it works, sometimes it does not), but at least I have something now.
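Since the workaround is reported to fail on some iterations, it may help to detect when a saved file is not actually a zip archive (for example, when an HTML page was written to disk instead). The sketch below is an assumption-laden illustration, not part of the original answer: the hypothetical `ZipCheck.looksLikeZip` only checks that the file starts with the two bytes `PK`, which zip archives begin with. It does not validate the rest of the archive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original post or of HtmlUnit.
final class ZipCheck {
    private ZipCheck() {}

    /**
     * Returns true if the file starts with the bytes 'P' 'K'.
     * This is only a cheap sanity check on the first two bytes.
     */
    static boolean looksLikeZip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();   // -1 if the file is empty
            int second = in.read();  // -1 if the file has fewer than two bytes
            return first == 'P' && second == 'K';
        }
    }
}
```

A loop over cars could call this after each download and log or retry the cars whose file fails the check.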


Source: https://habr.com/ru/post/1380168/

