Web crawler in Python for Windows that can interpret JavaScript

My ultimate goal is to create a web crawler capable of loading all the images on a web page. From what I have read, I need to embed a rendering/layout engine such as Gecko or WebKit.

Unfortunately, I run Windows, so PyWebKit is not available, and short of learning C++ to use Gecko or Java to use Rhino, I don't know where to turn.

Is there a reliable Python binding for a rendering engine that works on Windows (64-bit, Windows 7)? Is there an easy way to execute JavaScript from a Python script on Windows?

+4
1 answer

You do not need WebKit for this. All you need is an engine to run JavaScript code, so check out Google V8 or Mozilla SpiderMonkey.

If you want to write your crawler in Python, you can use PyV8, which provides Python bindings for V8.
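As a rough illustration, evaluating a snippet of JavaScript through PyV8 might look like the sketch below. It assumes PyV8 is installed separately (it is a third-party package and can be awkward to build on Windows); the helper name `eval_js` is made up for this example. Note that V8 on its own provides no `document` or `window`, so scripts that touch the DOM would need those objects supplied by the caller.

```python
def eval_js(source):
    """Evaluate a JavaScript snippet with PyV8 and return the result.

    eval_js is a hypothetical helper for this example, not part of PyV8.
    """
    try:
        # Third-party package; assumed to be installed separately.
        import PyV8
    except ImportError:
        raise RuntimeError("PyV8 is not installed")

    # JSContext works as a context manager: the context is entered for
    # the duration of the block and left again afterwards.
    with PyV8.JSContext() as ctxt:
        return ctxt.eval(source)


if __name__ == "__main__":
    try:
        print(eval_js("1 + 2"))
    except RuntimeError as err:
        print(err)
```

The import is done inside the function so that the script still loads, and reports a clear error, on a machine where PyV8 is missing.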

+3

Source: https://habr.com/ru/post/1339755/

