I am new to Python and web scraping, and I am trying to write a very simple script that will fetch data from a web page that can only be accessed after logging in. I looked through a few examples, but none of them fix the problem. This is what I have so far:
from bs4 import BeautifulSoup
import urllib, urllib2, cookielib
username = 'name'
password = 'pass'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'password' : password})
opener.open('WebpageWithLoginForm', login_data)  # POST the credentials so the CookieJar receives the session cookie
resp = opener.open('WebpageIWantToAccess')
soup = BeautifulSoup(resp, 'html.parser')
print soup.prettify()
As it stands, when I print the page, it just prints the contents of the page as if I were not logged in. I think the problem is with the way I set the cookies, but I'm really not sure, because I don't quite understand what is happening with the cookie processor and its libraries. Thanks!
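For what it's worth, here is a Python 3 sketch of the same flow (urllib2 and cookielib were merged into urllib.request and http.cookiejar). The key point is that the login data has to be sent as the POST body, and it must be bytes; the field names 'username' and 'password' are assumptions, so check the real form's HTML. The network calls are left commented out because the URLs are placeholders.

```python
# Python 3 equivalent of the urllib2/cookielib snippet above.
# Assumes the form fields are really named 'username' and 'password'.
import http.cookiejar
import urllib.parse
import urllib.request

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

# urlencode returns a str; POST bodies must be bytes, hence .encode().
login_data = urllib.parse.urlencode({'username': 'name',
                                     'password': 'pass'}).encode('utf-8')

# Passing login_data as the second argument makes this a POST; the
# original code never sent it, so no session cookie was ever stored.
# resp = opener.open('WebpageWithLoginForm', login_data)
# page = opener.open('WebpageIWantToAccess')  # cookies from cj are reused here
```

Every request made through `opener` after the login POST automatically carries whatever cookies the server set, which is what keeps you logged in.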
Current code:
import requests

EMAIL = 'usr'
PASSWORD = 'pass'
URL = 'https://connect.lehigh.edu/app/login'

def main():
    # requests.session(config={...}) no longer exists; Session() persists
    # cookies across requests, which is all we need here
    session = requests.Session()
    login_data = {
        'username': EMAIL,
        'password': PASSWORD,
        'LOGIN': 'login',
    }
    r = session.post(URL, data=login_data)
    # the same session sends the login cookies with this request
    r = session.get('https://lewisweb.cc.lehigh.edu/PROD/bwskfshd.P_CrseSchdDetl')
    print(r.text)

if __name__ == '__main__':
    main()
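One common reason a login POST appears to succeed but leaves you logged out is a hidden CSRF token in the form that must be echoed back with the credentials. Below is a sketch of pulling hidden inputs out of the login page with BeautifulSoup before posting; the field names and the sample form are hypothetical, so inspect the real page to find the actual ones.

```python
# Sketch: merge any hidden inputs (e.g. a CSRF token) from the login
# form with the credential fields before POSTing. Field names here are
# hypothetical -- inspect the real form's HTML to confirm them.
from bs4 import BeautifulSoup

def build_login_data(form_html, email, password):
    """Collect hidden <input> values from the form and add credentials."""
    soup = BeautifulSoup(form_html, 'html.parser')
    data = {inp['name']: inp.get('value', '')
            for inp in soup.find_all('input', type='hidden')
            if inp.has_attr('name')}
    data.update({'username': email, 'password': password})
    return data

# Offline demonstration with a made-up form:
sample = '''<form action="/app/login" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>'''
print(build_login_data(sample, 'usr', 'pass'))

# usage with a session (URLs as in the snippet above):
#   data = build_login_data(session.get(URL).text, EMAIL, PASSWORD)
#   session.post(URL, data=data)
```

If the site sets its tokens via JavaScript rather than hidden inputs, requests alone will not see them, and a browser-driving tool would be needed instead.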