Saving page output from a PhantomJS script

Digikey has changed its site, which now runs JavaScript that is called onload via message. This broke my former plain Java retriever code. I am trying to use PhantomJS so that the JavaScript executes before I save the HTML / text.

    var page = new WebPage(), t, address;
    var fs = require('fs');

    if (phantom.args.length === 0) {
        console.log('Usage: save.js <some URL>');
        phantom.exit();
    } else {
        address = encodeURI(phantom.args[0]);
        page.open(address, function (status) {
            if (status !== 'success') {
                console.log('FAIL to load the address');
            } else {
                f = null;
                var markup = page.content;
                console.log(markup);
                try {
                    f = fs.open('htmlcode.txt', "w");
                    f.write(markup);
                    f.close();
                } catch (e) {
                    console.log(e);
                }
            }
            phantom.exit();
        });
    }

This code works with most web pages, but it does not work with this one:

http://search.digikey.com/scripts/dksearch/dksus.dll?keywords=S7072-ND

This is my test case. PhantomJS cannot open the URL and then crashes. I am using the win32 static build 1.3.

Any tips?

Basically, what I am after is a wget that completes page rendering and runs the scripts that modify the document before saving the file.

1 answer

A quick and dirty solution (though one posted on the PhantomJS site) is to use a timeout. I modified your code to include a 2 second wait. This gives the page 2 seconds to load and run its scripts before the contents are dumped into a file. If you need exact timing, or the load time varies greatly, this solution probably will not work for you.

    var page = new WebPage(), address;
    var fs = require('fs');

    if (phantom.args.length === 0) {
        console.log('Usage: save.js <some URL>');
        phantom.exit();
    } else {
        address = encodeURI(phantom.args[0]);
        page.open(address, function (status) {
            if (status !== 'success') {
                console.log('FAIL to load the address');
                phantom.exit();
            } else {
                // Give the page's scripts 2 seconds to run before dumping the content.
                window.setTimeout(function () {
                    var markup = page.content;
                    console.log(markup);
                    try {
                        var f = fs.open('htmlcode.txt', "w");
                        f.write(markup);
                        f.close();
                    } catch (e) {
                        console.log(e);
                    }
                    // Exit only after the file has been written.
                    phantom.exit();
                }, 2000);
            }
        });
    }
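If the load time varies too much for a fixed 2 second wait, one possible alternative is to poll for a condition and dump the page as soon as it holds, giving up after a maximum wait. The helper below is only a sketch of that idea: `waitFor` and its parameter names are made up for illustration and are not part of the PhantomJS API, and the element id in the commented usage is hypothetical.

```javascript
// Poll `condition` every `intervalMs` milliseconds until it returns true
// or `timeoutMs` milliseconds have passed, then call `done` exactly once.
// `done` receives true if the condition was met, false on timeout.
// NOTE: waitFor is an illustrative helper, not a PhantomJS built-in.
function waitFor(condition, done, timeoutMs, intervalMs) {
    var start = new Date().getTime();
    var timer = setInterval(function () {
        var met = false;
        try {
            met = !!condition();
        } catch (e) {
            met = false; // treat an error in the check as "not ready yet"
        }
        if (met || new Date().getTime() - start >= timeoutMs) {
            clearInterval(timer);
            done(met);
        }
    }, intervalMs);
}

// Hypothetical use in place of window.setTimeout in the script above
// ('results' is an assumed element id; pick one the target page really has):
//
// waitFor(function () {
//     return page.evaluate(function () {
//         return document.getElementById('results') !== null;
//     });
// }, function (ready) {
//     // write page.content to htmlcode.txt as before, then:
//     phantom.exit();
// }, 10000, 250);
```

With this approach the script still ends after at most `timeoutMs`, so a page that never reaches the expected state does not hang it; the `ready` flag tells you whether the saved markup is likely to be complete.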

Source: https://habr.com/ru/post/904830/

