C# - A better way to scrape PDF files from a site that relies on JavaScript

I am currently building a small WatiN-based application that visits a website and then opens a series of URLs to download PDF files.

The website uses a lot of JavaScript to load the PDFs into embedded HTML.

The program works fine at the moment, but it is very slow because WatiN does not handle downloads efficiently (it goes through the Firefox download dialog and slowly types in the file name before saving).

I would like to know whether there is a better web scraping framework that offers the same support for Ajax-heavy sites but a better/faster way to download files.

I have searched all over the Internet and found Selenium, but it does not appear to be any more efficient than WatiN at downloading files.

Thanks in advance for your help.

1 answer

You could write a Google Chrome extension that uses these two APIs as its main engine:

https://developer.chrome.com/extensions/webRequest.html to find out when and how authentication happens and when a download should start, and:

https://developer.chrome.com/extensions/downloads.html to start downloading the file.
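
As a rough illustration of how the two APIs might be combined, the sketch below watches response headers with `chrome.webRequest.onHeadersReceived` and hands any PDF response to `chrome.downloads.download`. It is only a sketch under several assumptions: the function names (`installPdfDownloader`, `isPdfResponse`, `fileNameFor`), the file-naming rule, and the URL patterns are hypothetical, the extension is assumed to declare the `webRequest` and `downloads` permissions plus host access for the site, and the site is assumed to serve its PDFs with an `application/pdf` content type. The `chrome` object is passed in as a parameter so the logic can also be tried outside the browser.

```javascript
// Returns true when the response headers declare a PDF body.
function isPdfResponse(responseHeaders) {
  return (responseHeaders || []).some(
    (h) =>
      h.name.toLowerCase() === "content-type" &&
      (h.value || "").toLowerCase().startsWith("application/pdf")
  );
}

// Derives a file name from the last URL path segment.
// Hypothetical naming rule, chosen only for this illustration.
function fileNameFor(url) {
  const last =
    new URL(url).pathname.split("/").filter(Boolean).pop() || "download";
  return last.toLowerCase().endsWith(".pdf") ? last : last + ".pdf";
}

// Registers a listener that queues a download for every PDF response seen
// on the given URL patterns. In a real extension, pass `chrome` as chromeApi.
function installPdfDownloader(chromeApi, urlPatterns) {
  const seen = new Set(); // URLs already queued, to avoid duplicate downloads

  chromeApi.webRequest.onHeadersReceived.addListener(
    (details) => {
      if (seen.has(details.url) || !isPdfResponse(details.responseHeaders)) {
        return;
      }
      seen.add(details.url);
      chromeApi.downloads.download({
        url: details.url,
        filename: fileNameFor(details.url),
        conflictAction: "uniquify", // rename instead of overwriting
        saveAs: false, // skip the Save As dialog
      });
    },
    { urls: urlPatterns },
    ["responseHeaders"] // needed so details.responseHeaders is populated
  );
}
```

In a background script this would be started with something like `installPdfDownloader(chrome, ["https://example.com/*"])`, where the pattern is a placeholder for the real site. Note that `chrome.downloads.download` issues its own request for the URL, so each PDF is fetched a second time; whether that is acceptable depends on how the site authenticates.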

Whatever these two APIs lack for your purposes, you can make up for with a content script (JavaScript that the extension injects into the page it opens) which, for example, hooks into jQuery's ready event to start the scraping.
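
A script injected into the page could look roughly like the sketch below. It is an assumption-laden illustration rather than a known-good recipe for this site: it uses the plain DOM `DOMContentLoaded` event in place of jQuery's ready hook, the function names (`collectPdfUrls`, `startScraping`), the selector list, and the `"pdf-urls"` message type are all hypothetical, and it assumes the PDF URLs end in `.pdf`. The document and the message sender are passed in as parameters so the logic can be exercised outside a browser.

```javascript
// Collects the distinct URLs of elements that appear to point at PDF files.
function collectPdfUrls(doc) {
  const urls = new Set();
  const selector = "a[href], embed[src], iframe[src], object[data]";
  for (const el of doc.querySelectorAll(selector)) {
    const raw = el.href || el.src || el.data || "";
    // Match ".pdf" at the end of the URL or just before a query/fragment.
    if (/\.pdf($|[?#])/i.test(raw)) {
      urls.add(raw);
    }
  }
  return [...urls];
}

// Runs the collection once the DOM is ready and reports the result.
// In an extension, pass `document` and a wrapper around
// `chrome.runtime.sendMessage` as the two arguments.
function startScraping(doc, sendMessage) {
  const run = () =>
    sendMessage({ type: "pdf-urls", urls: collectPdfUrls(doc) });

  if (doc.readyState === "loading") {
    doc.addEventListener("DOMContentLoaded", run, { once: true });
  } else {
    run(); // the DOM was already parsed when the script was injected
  }
}
```

Inside the extension this might be invoked as `startScraping(document, (msg) => chrome.runtime.sendMessage(msg))`, with the background script receiving the list and deciding what to download. If the site inserts its PDF elements only after later Ajax calls, a single pass at DOM-ready will miss them and the collection would need to be repeated.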

This should be faster than WatiN, since WatiN adds an abstraction layer on top of simply talking to the browser.


Source: https://habr.com/ru/post/1439497/

