Scrape a web page that requires a session cookie first

I am trying to scrape an Excel file from a government muster roll database. However, the URL I have to request to get this Excel file:

http://nrega.ap.gov.in/Nregs/FrontServlet?requestType=HouseholdInf_engRH&hhid=192420317026010002&actionVal=musterrolls&type=Normal

requires that I have a session cookie from the government site attached to the request.

How can I grab a session cookie with an initial request to the landing page (where the site gives you a session cookie) and then use it to request the URL above and download the Excel file? I am using Python on Google App Engine.

I tried this:

    import urllib2
    import cookielib

    url = 'http://nrega.ap.gov.in/Nregs/FrontServlet?requestType=HouseholdInf_engRH&hhid=192420317026010002&actionVal=musterrolls&type=Normal'

    def grab_data_with_cookie(cookie_jar, url):
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))
        data = opener.open(url)
        return data

    cj = cookielib.CookieJar()

    # grab the data
    data1 = grab_data_with_cookie(cj, url)

    # the second time we do this, we get back the excel sheet.
    data2 = grab_data_with_cookie(cj, url)
    stuff2 = data2.read()

I am sure this is not the best way to do this. How can I do it more cleanly, or even with the requests library?

+4
2 answers

Using requests, this is a trivial task:

    >>> import requests
    >>> url = 'http://httpbin.org/cookies/set/requests-is/awesome'
    >>> r = requests.get(url)
    >>> print r.cookies
    {'requests-is': 'awesome'}
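The snippet above only shows that requests exposes the cookies of a response. For the two-step flow in the question, a requests.Session keeps cookies from one request and sends them with the next. The following is a minimal sketch rather than a tested scraper for this site: the function name fetch_with_session_cookie and the 30-second timeout are illustrative choices, and which page actually issues the cookie is an assumption you would need to confirm.

```python
import requests


def fetch_with_session_cookie(landing_url, file_url, session=None):
    """Visit landing_url to pick up cookies, then download file_url with them.

    Hypothetical helper: the name and the timeout are illustrative only.
    """
    # A Session stores cookies set by earlier responses and attaches
    # them to later requests made through the same object.
    if session is None:
        session = requests.Session()

    # First request: only needed for the cookies the server sets.
    session.get(landing_url, timeout=30).raise_for_status()

    # Second request: the stored session cookie is sent automatically.
    response = session.get(file_url, timeout=30)
    response.raise_for_status()

    # Raw bytes of the download (here, the Excel file).
    return response.content
```

Following the question's own approach, you could pass the question's URL as both landing_url and file_url; if the site only sets the cookie on a separate landing page, pass that page's URL first.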
+10

Using cookielib and urllib2:

    import cookielib
    import urllib2

    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    # use opener to open different urls

You can use the same opener for multiple connections:

 data = [opener.open(url).read() for url in urls] 

Or install it globally:

 urllib2.install_opener(opener) 

In the latter case, the rest of the code looks the same with or without cookie support:

 data = [urllib2.urlopen(url).read() for url in urls] 
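On Python 3 the same modules exist under new names: urllib2 became urllib.request and cookielib became http.cookiejar. The sketch below ports the shared-opener pattern under that assumption; make_cookie_opener and fetch_all are hypothetical helper names, and the timeout value is arbitrary.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(cookie_jar=None):
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    if cookie_jar is None:
        cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    return opener, cookie_jar


def fetch_all(opener, urls, timeout=30):
    """Open each URL in order with the same opener and return the bodies."""
    bodies = []
    for url in urls:
        # Reusing one opener means cookies set by an earlier response
        # are sent along with the later requests.
        with opener.open(url, timeout=timeout) as response:
            bodies.append(response.read())
    return bodies
```

Calling fetch_all(opener, [landing_url, file_url]) would request the landing page first, so any cookie it sets is stored in the jar and sent with the second request. urllib.request.install_opener(opener) plays the same role as urllib2.install_opener for code that calls urllib.request.urlopen directly.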
+3

Source: https://habr.com/ru/post/1258377/

