I am looking to automate some web interactions, namely periodically downloading files from a secure website. This basically involves entering my username / password and navigating to the corresponding URL.
I tried simple scripts in Python, and then more complex ones, only to find that this particular site uses some kind of nasty JavaScript and Flash mechanism for logging in, which makes my methods useless.
Then I tried HtmlUnit, but it doesn't seem to work either. I suspect the site's use of Flash is the problem.
I don't want to dig into the login mechanism any further, so I'm leaning towards scripting an actual browser to log in and grab the file I need.
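For context, a plain-HTTP login script of the kind described above might look like the sketch below. It is only an illustration: `LOGIN_URL`, `FILE_URL`, and the `username` / `password` form field names are hypothetical placeholders, not taken from the real site. It assumes the login is an ordinary HTML form POST that sets a session cookie, which is exactly the assumption that breaks when the login is driven by JavaScript or Flash.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical values: the real site's URLs and form field names are unknown.
LOGIN_URL = "https://example.com/login"
FILE_URL = "https://example.com/files/report.csv"


def build_opener():
    """Return an opener that keeps session cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(username, password):
    """Build the POST request that submits the (assumed) login form."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(LOGIN_URL, data=form.encode("utf-8"))


def download(opener, username, password, dest):
    """Log in, then fetch FILE_URL with the same cookies and save it to dest."""
    opener.open(build_login_request(username, password)).close()
    with opener.open(FILE_URL) as resp, open(dest, "wb") as out:
        out.write(resp.read())
```

Calling `download(build_opener(), "me", "secret", "report.csv")` from a cron job would cover the "periodically" part, but only on sites where the form-post assumption holds.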
Requirements:
- Run on a Linux server (i.e. without X running). If I really need X, I can set it up, but I won't be happy about it.
- Be reliable. I want to start this and never think about it again.
- Be scriptable. Nothing too complicated, but I should be able to tell the browser which steps to perform and which pages to visit.
Are there any good tools for a headless, X-less scriptable browser? Have you tried something like this, and if so, do you have any words of wisdom?
firefox screen-scraping webkit headless-browser
Parand, Jan 15 '10 at 17:20