I am trying to download some data from a website using Python. If you just paste the URL into a browser, the page shows nothing until you fill in the login form. I have a username and password, but how do I supply them from Python?
My current code is:
import urllib, urllib2, cookielib
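username = 'my_user_name'  # placeholder for the real username
password = 'my_pwd'  # placeholder for the real password
link = 'http://www.google.com'  # placeholder URL; urllib2 needs the http:// scheme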
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'j_password' : password})
# Passing data makes this a POST; one call is enough
resp = opener.open(link, login_data)
print resp.read()
No error is raised, but resp.read() returns a bunch of CSS plus messages like "you need to log in before reading the news here."
So how can I fetch the page that is shown after login?
I just noticed that the site requires 3 entries:
Company:
Username:
Password:
I have all three values, but how do I pass all of them in the login data?
This is the code I run without logging in:
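My guess is that the third value just becomes one more key in the encoded form data. Here is a rough sketch of that idea. The field names 'company', 'username' and 'password' are assumptions on my part (the real names would be the name= attributes of the form's input tags, which I have not confirmed), and build_login_data is only a helper name I made up. The imports fall back to the Python 2 module names used in my code above.

```python
try:  # Python 3 module names
    from urllib.parse import urlencode
except ImportError:  # Python 2, as in the code above
    from urllib import urlencode


def build_login_data(company, username, password):
    """Encode the three login values as a form-encoded POST body."""
    # ASSUMPTION: these keys must match the name= attributes of the
    # form's <input> tags; the names used here are guesses.
    fields = [
        ('company', company),
        ('username', username),
        ('password', password),
    ]
    # A list of pairs keeps the field order stable; encoding to bytes
    # is required by Python 3 and harmless on Python 2.
    return urlencode(fields).encode('ascii')


login_data = build_login_data('my_company', 'my_user_name', 'my_pwd')
```

The result would then be passed as the second argument to opener.open, in place of the two-field login_data.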
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# dd holds the URL of the page
resp = opener.open(dd)
print resp.read()
Here is what gets printed:
<DIV id=header>
<DIV id=strapline>
<P><FONT color=#000000>All third party users of this website and/or data produced by the Baltic do so at their own risk. The Baltic owes no duty of care or any other obligation to any party other than the contractual obligations which it owes to its direct contractual partners. </FONT></P><IMG src="images/top-strap.gif"> </DIV>
<DIV id=memberNav>
<FORM class=members id=form1 name=form1 action=client_login/client_authorise.asp?action=login method=post onsubmits="return check()">
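Judging from the form tag in this output, the login seems to be a POST to the relative URL client_login/client_authorise.asp?action=login rather than to the page itself. Below is a rough sketch of what I think is needed: resolve that action against the page URL, post the login data there, then request the page again with the same cookie-aware opener. The helper names login_url and fetch_after_login are made up for illustration, the example.com address in the usage line is a placeholder, and I have not verified that the site accepts a login this way.

```python
try:  # Python 3 module names
    from urllib.parse import urljoin
    from urllib.request import build_opener, HTTPCookieProcessor
    from http.cookiejar import CookieJar
except ImportError:  # Python 2, as in the code above
    from urlparse import urljoin
    from urllib2 import build_opener, HTTPCookieProcessor
    from cookielib import CookieJar


def login_url(page_url, form_action):
    """Resolve the form's relative action against the page it appears on."""
    return urljoin(page_url, form_action)


def fetch_after_login(page_url, form_action, login_data):
    """POST the login form, then fetch the page with the same cookies.

    login_data is the form-encoded body as bytes.
    """
    opener = build_opener(HTTPCookieProcessor(CookieJar()))
    # Supplying data turns the request into a POST; any session cookie
    # the server sets is stored in the CookieJar.
    opener.open(login_url(page_url, form_action), login_data).close()
    # The second request reuses the opener, so the cookie is sent back.
    return opener.open(page_url).read()
```

With a placeholder address, login_url('http://example.com/index.asp', 'client_login/client_authorise.asp?action=login') gives the absolute URL to post to.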