Follow javascript link with mechanize and python

I'm doing some web scraping, and the project is almost complete, except that I need to follow a javascript link and I can't figure out how to do it with mechanize in Python.

A list of javascript links appears on one of the pages, and I want to follow them one by one, scrape some data, and repeat. I know mechanize doesn't execute javascript, but does anyone know a workaround? Here is the code I use to find the links:

    for Auth in iterAuths:
        Auth = str(Auth.contents[0]).strip()
        br.find_link(text=Auth)

Now, if I do br.follow_link(text=Auth), I get the error urllib2.URLError: <urlopen error unknown url type: javascript>.

If I do print br.click_link(text=Auth), it prints a Request for javascript:SendThePage('5660').

I just need to follow the javascript links. Can anyone help?

1 answer

When I needed to do something similar, I looked closely at the links I was trying to follow.

Some of them were static links generated by the javascript. They were predictable/consistent enough that I could build the list manually ahead of time.

Others were simply URLs constructed from parameters. Those can also be parsed ahead of time and generated on the Python side, then requested directly instead of "clicking" the link.
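For example, here is a rough sketch of that second approach, assuming SendThePage('5660') ultimately just loads a page keyed by that ID. The URL pattern (sendpage.asp?id=...) and the list-page address are purely hypothetical; you'd need to read the SendThePage function in the page source to see what it actually requests.

    import re
    import mechanize

    br = mechanize.Browser()
    br.open("http://example.com/list-page")  # hypothetical list page

    # Collect the IDs embedded in the javascript: links, e.g. SendThePage('5660')
    ids = []
    for link in br.links():
        m = re.search(r"SendThePage\('(\d+)'\)", link.url)
        if m:
            ids.append(m.group(1))

    # Request each page directly instead of "clicking" the javascript link.
    # This URL pattern is a guess -- replace it with whatever SendThePage builds.
    for page_id in ids:
        response = br.open("http://example.com/sendpage.asp?id=%s" % page_id)
        html = response.read()
        # ... scrape the data you need from html ...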

If you really need to execute the javascript, you can try a PyV8 + mechanize hybrid. I've played with it a bit and it seems pretty cool. PyV8 bridges Python and the V8 javascript engine, which lets you create JS contexts and execute arbitrary code in them, and it does a nice job of letting you go back and forth between the two languages.
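A minimal PyV8 sketch, assuming PyV8 is installed. It only shows evaluating javascript with a Python object exposed as the global scope; actually emulating enough of the browser for something like SendThePage to run would be considerably more work.

    import PyV8

    class Global(PyV8.JSClass):
        """Python object exposed as the javascript global scope."""
        def log(self, message):
            print(message)

    ctxt = PyV8.JSContext(Global())
    ctxt.enter()
    try:
        # Run arbitrary javascript; it can call back into Python via the global scope.
        ctxt.eval("log('page id is ' + '5660')")
    finally:
        ctxt.leave()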

I don't have code for your exact site, but one of these three approaches should work for you :) Good luck!


Source: https://habr.com/ru/post/1502234/

