Headless, scriptable Firefox / WebKit on Linux?

I am looking to automate some web interactions, namely periodically downloading files from a secure website. This basically involves entering my username / password and navigating to the corresponding URL.

I tried simple scripting in Python, and then more elaborate scripts, only to find that this particular site uses some kind of nasty JavaScript- and Flash-based login mechanism, which makes my methods useless.

Then I tried HtmlUnit, but it doesn't seem to want to work either. I suspect the use of Flash is the problem.

I don't want to think about it too much, so I'm leaning towards scripting an actual browser to log in and grab the file I need.

Requirements:

  • Run on a Linux server (i.e. without X running). If I really need to have X, I can make that happen, but I won't be happy.
  • Be reliable. I want to start this and never think about it again.
  • Be scriptable. Nothing too complicated, but I should be able to tell the browser the various steps to take and pages to visit.

Are there any good toolkits for a headless, X-less scriptable browser? Have you tried something like this, and if so, do you have any words of wisdom?

+44
firefox screen-scraping webkit headless-browser
Jan 15 '10 at 17:20
7 answers

I did a related task with an embedded IE browser (although it was a GUI application with a hidden browser component panel). In fact, you can take any layout engine and cut out the output logic. Navigation should be done by firing script-like events.

You can use Crowbar. It is a headless version of Firefox (Gecko engine). It turns the browser into a RESTful server that can accept requests ("fetch url"). It then parses the HTML, represents it as a DOM, and waits for a configured delay so that all scripts can execute.

It works on Linux. I suppose you can easily extend it for your purpose using JS and XULRunner's rich capabilities.
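As a rough illustration of the "fetch url" request described above, here is a minimal Python sketch using only the standard library. The endpoint address, the port 10000, and the `url` / `delay` form fields are assumptions about how a local Crowbar instance is set up, and the function names are made up for this example, so check them against your own installation.

```python
import urllib.parse
import urllib.request

# Assumption: Crowbar is running locally and listening on port 10000.
CROWBAR_ENDPOINT = "http://127.0.0.1:10000/"


def build_crowbar_request(page_url, delay_ms=3000, endpoint=CROWBAR_ENDPOINT):
    """Build a POST asking Crowbar to load page_url and wait delay_ms for scripts.

    The form field names "url" and "delay" are assumed, not verified here.
    """
    form = urllib.parse.urlencode({"url": page_url, "delay": delay_ms})
    return urllib.request.Request(endpoint, data=form.encode("ascii"), method="POST")


def fetch_rendered_html(page_url, delay_ms=3000, timeout=60):
    """Return the serialized DOM sent back by Crowbar (needs a running instance)."""
    request = build_crowbar_request(page_url, delay_ms)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_rendered_html("http://example.com/")` should return the page as it looks after its scripts have run. This only covers fetching a page, so a login flow would still need the JS / XULRunner extension mentioned above.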

+17
May 31 '10 at 15:30

What about PhantomJS?

+38
Feb 24 '11 at 11:56

Have you tried Selenium? It lets you record a usage scenario with a Firefox extension, and the recording can later be played back using several different methods.

Edit: I just realized that this is a very late answer. :)
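Besides replaying a recording, Selenium can also be driven from code. The sketch below uses the Python WebDriver bindings and assumes that the `selenium` package and geckodriver are installed, that your Firefox build supports the `-headless` switch, and that the login page is a plain HTML form. The field names `username` and `password` are hypothetical, and a Flash-based login like the one in the question would not be handled by this.

```python
def make_headless_firefox():
    """Start Firefox without a visible window (needs selenium and geckodriver)."""
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    return webdriver.Firefox(options=options)


def log_in(driver, login_url, username, password):
    """Fill in and submit a login form; the field names are hypothetical."""
    driver.get(login_url)
    # "name" is the locator strategy string behind selenium's By.NAME.
    driver.find_element("name", "username").send_keys(username)
    password_box = driver.find_element("name", "password")
    password_box.send_keys(password)
    password_box.submit()
```

A typical run would create the driver with `make_headless_firefox()`, call `log_in(...)`, navigate to the download URL with `driver.get(...)`, and finish with `driver.quit()`.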

+8
Mar 08

Take a look at WebKitDriver. The project includes a headless implementation of WebKit.

+6
May 16 '11 at 5:56

I don't know how to do Flash interactions (and am also interested in that), but for HTML / JavaScript you can use Chickenfoot.

And to get a headless, scriptable browser running on Linux, you can use the Qt WebKit library. Here is a usage example.

+1
Jan 30

To accomplish this, I just write Chrome extensions that post to CouchDB (example and Futon). Add the Couch to the permissions in the manifest to allow cross-domain XHRs.

(I came to this thread looking for a headless alternative to what I have been doing. Having found it, I am going to try Crowbar at some point.)

Also, given the bizarre features of this website, I can't help but wonder if you could exploit some security hole to get around the Flash and JavaScript.

0
Nov 02

iMacros for Linux lets you script Firefox and Chrome: http://wiki.imacros.net/Linux

-1
May 18 '11 at 14:25