HtmlUnit does not work with AngularJS

Following https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot, I am trying to use HtmlUnit (2.13) to create a snapshot of a web page that uses AngularJS (1.2.1).

My Java code is:

    WebClient webClient = new WebClient();
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    webClient.setCssErrorHandler(new SilentCssErrorHandler());

    webClient.getOptions().setCssEnabled(true);
    webClient.getOptions().setRedirectEnabled(false);
    webClient.getOptions().setAppletEnabled(false);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setPopupBlockerEnabled(true);
    webClient.getOptions().setTimeout(10000);
    webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
    webClient.getOptions().setThrowExceptionOnScriptError(true);
    webClient.getOptions().setPrintContentOnFailingStatusCode(true);

    HtmlPage page = webClient.getPage(new WebRequest(new URL("..."), HttpMethod.GET));
    webClient.waitForBackgroundJavaScript(5000);
    String result = page.asXml();

Although webClient.getPage(...) does not throw any exceptions, the result string still contains unevaluated Angular expressions such as

 <div> {{name}} </div> 
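To check a snapshot for this symptom automatically, something along these lines might help. It is only a minimal sketch: SnapshotCheck and unevaluatedExpressions are hypothetical names (not part of HtmlUnit or AngularJS), and it assumes the default {{ }} interpolation delimiters with each expression on a single line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of HtmlUnit or AngularJS.
final class SnapshotCheck {

    // Assumption: the page uses the default {{ ... }} delimiters and
    // every expression fits on one line ('.' does not match newlines).
    private static final Pattern EXPRESSION =
            Pattern.compile("\\{\\{\\s*(.*?)\\s*\\}\\}");

    // Returns the text of every {{ ... }} expression still present in the HTML.
    static List<String> unevaluatedExpressions(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = EXPRESSION.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Demonstrates the check on the fragment shown above.
        System.out.println(unevaluatedExpressions("<div> {{name}} </div>"));
    }
}
```

Passing the string returned by page.asXml() to unevaluatedExpressions(...) and failing when the list is not empty would turn the symptom into an explicit error instead of a silently broken snapshot.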

I am aware of http://htmlunit.10904.n7.nabble.com/htmlunit-to-scrape-angularjs-td29931.html#a30075, but the recommendation given there does not work either.

Of course, the same GET request works without exception in all current browsers.

Any ideas / experience on how to get HtmlUnit to work with AngularJS?

Update:

I filed an HtmlUnit bug report. For now, I have switched my implementation to PhantomJS. Perhaps this piece of code helps others with a similar problem:

    System.setProperty("phantomjs.binary.path", "phantomjs.exe");

    DesiredCapabilities caps = new DesiredCapabilities();
    caps.setJavascriptEnabled(true);
    caps.setCapability("takesScreenshot", false);

    PhantomJSDriver driver = new PhantomJSDriver(caps);
    driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
    driver.get("...");  // WebDriver.get() takes the URL as a String
    String result = driver.getPageSource();

Update 2: I stopped rendering my pages manually, as Google's crawler now renders Angular sites itself.

+6
5 answers

I had the same problem, but I could not use manual bootstrapping because the Angular e2e tests do not work with a manually bootstrapped application.

I solved the problem using

 <html id="ng-app" class="ng-app: appmodule;"> 

instead of

 <html ng-app="appmodule"> 

This works with HtmlUnit, and the e2e tests work as well.

Most likely, HtmlUnit does not (fully?) support document.querySelectorAll(). This method is used by angularInit() to find ng-app directives.

The id/class syntax for the ng-app directive works around the document.querySelectorAll() calls in angularInit().
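To illustrate how the class-based form can carry the module name, here is a rough Java sketch. The pattern is modeled on the class-name regular expression that angularInit() appears to use in AngularJS 1.2; treat that as an assumption rather than a quote from the library. NgAppClassSyntax and moduleFromClass are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; this is not AngularJS or HtmlUnit code.
final class NgAppClassSyntax {

    // Assumption: modeled on the pattern AngularJS 1.2 applies to
    // " " + element.className + " " when it looks for ng-app.
    private static final Pattern NG_APP_CLASS =
            Pattern.compile("\\sng[:\\-]app(:\\s*([\\w\\d_]+);?)?\\s");

    // Returns the module name, "" if ng-app is present without a name,
    // or null if the class attribute does not mention ng-app at all.
    static String moduleFromClass(String className) {
        Matcher m = NG_APP_CLASS.matcher(" " + className + " ");
        if (!m.find()) {
            return null;
        }
        String module = m.group(2);
        return module == null ? "" : module;
    }

    public static void main(String[] args) {
        // The class value from the html element shown above.
        System.out.println(moduleFromClass("ng-app: appmodule;"));
    }
}
```

If the element is first located through its id and the module name is then read from the class attribute in this way, the attribute selector passed to querySelectorAll() is not needed, which would explain why the workaround helps.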

+8

I had the same problem with unevaluated Angular expressions when using HtmlUnit. The solution is to bootstrap the application manually. Steps to reproduce:

A minimal example of an application that works in a browser but not with HtmlUnit:

    <!doctype html>
    <html ng-app>
      <head>
        <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
      </head>
      <body>
        <div>
          <label>Name:</label>
          <input type="text" ng-model="yourName" placeholder="Enter a name here">
          <hr>
          <h1>Hello {{yourName}}!</h1>
        </div>
      </body>
    </html>

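Modification steps: remove the ng-app attribute from the html element and bootstrap the application manually once the document is ready.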

And now a working example:

    <!doctype html>
    <html>
      <head>
        <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
        <script>
          angular.element(document).ready(function() {
            angular.module('myApp', []);
            angular.bootstrap(document, ['myApp']);
          });
        </script>
      </head>
      <body>
        <div>
          <label>Name:</label>
          <input type="text" ng-model="yourName" placeholder="Enter a name here">
          <hr>
          <h1>Hello {{yourName}}!</h1>
        </div>
      </body>
    </html>

Test:

    WebClient webClient = new WebClient();
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    HtmlPage page = webClient.getPage("http://localhost:8080/index.html");

    // Initial state
    assertEquals("Hello !", page.getElementsByTagName("h1").get(0).asText());

    // Set value
    ((HtmlInput) page.getElementsByTagName("input").get(0)).setValueAttribute("world");

    // New state
    assertEquals("Hello world!", page.getElementsByTagName("h1").get(0).asText());

This works, but it is not a very pleasant solution. I don't know whether this is an HtmlUnit or an AngularJS problem.

+1

This problem has now been fixed in HtmlUnit. AngularJS expressions are evaluated correctly.

https://sourceforge.net/p/htmlunit/bugs/1559/

+1

Similar code works fine for me when my single-page application uses AngularJS 1.0.4. The only thing I needed to do was tell HtmlUnit 2.12 to use FIREFOX_17 instead of its default, IE8 (the same as in the link you provided, but with FIREFOX_17 instead of FIREFOX_10):

 final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17); 

After I upgraded to AngularJS 1.2, my page was rendered with all the Angular placeholders still unevaluated.

0

Thank you for reporting this; it is fixed in SVN. Expect HtmlUnit 2.15 very soon.

The test case now works with Chrome emulation. The cause was that querySelectorAll() must be defined on both the document and elements.

Please note that others seem to have identified the root cause; providing a minimal test case to the HtmlUnit team would have gotten it fixed in a very short time.

Thanks again for your feedback.

0

Source: https://habr.com/ru/post/958672/

