I adapted the web scraping template that I use for most of my Python-based scrapers to fit your needs, and verified that it works with my own login info.
It works by simulating a browser and maintaining a cookie jar that stores your user session. I also got it working with BeautifulSoup for you.
Note: This is the Python 2 version. I added a working Python 3 example further below by request.
    import cookielib
    import os
    import urllib
    import urllib2
    import re
    import string
    from BeautifulSoup import BeautifulSoup

    username = "user@email.com"
    password = "password"

    cookie_filename = "parser.cookies.txt"

    class LinkedInParser(object):

        def __init__(self, login, password):
            """ Start up... """
            self.login = login
            self.password = password
Update June 19, 2014: Added parsing for the CSRF token from the main page for use in the updated login process.
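To illustrate the token-parsing step, here is a rough Python 3 sketch that pulls hidden form fields out of a page. It is only an illustration under assumptions: the sample HTML and the field name `csrfToken` are made up, and I use the standard library's `html.parser` so the snippet has no dependencies. With BeautifulSoup the same lookup would be something like `soup.find("input", {"name": "csrfToken"})["value"]`.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name -> value for every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "input" and attr.get("type") == "hidden" and "name" in attr:
            self.fields[attr["name"]] = attr.get("value", "")


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Made-up login page; the field name "csrfToken" is hypothetical.
sample_page = """
<form action="/login" method="post">
  <input type="hidden" name="csrfToken" value="abc123">
  <input type="text" name="session_key">
</form>
"""

fields = extract_hidden_fields(sample_page)
print(fields["csrfToken"])  # abc123
```

The extracted value is then sent back along with the username and password when posting the login form.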
Update July 23, 2015: Adding a Python 3 example here. It basically requires updating the library import locations and removing deprecated methods. It's not perfectly formatted or anything, but it is functional. Sorry for the rush job. In the end, the principles and steps are identical.
    import http.cookiejar as cookielib
    import os
    import urllib
    import re
    import string
    from bs4 import BeautifulSoup

    username = "user@email.com"
    password = "password"

    cookie_filename = "parser.cookies.txt"

    class LinkedInParser(object):

        def __init__(self, login, password):
            """ Start up... """
            self.login = login
            self.password = password
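The "simulate a browser and keep a cookie jar" idea can be sketched in Python 3 roughly as follows. This is a minimal sketch and not the rest of the class above: the helper name `build_session` and the User-Agent string are my own placeholders, and no request is actually sent here.

```python
import http.cookiejar
import os
import urllib.request


def build_session(cookie_path):
    """Return (opener, jar): an opener that keeps cookies in a file-backed jar."""
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Reuse a previously saved session if the cookie file already exists.
    if os.path.isfile(cookie_path):
        jar.load(ignore_discard=True)
    opener = urllib.request.build_opener(
        urllib.request.HTTPRedirectHandler(),
        urllib.request.HTTPCookieProcessor(jar),
    )
    # Present a browser-like User-Agent; this exact string is an arbitrary placeholder.
    opener.addheaders = [("User-Agent", "Mozilla/5.0 (compatible; example-scraper)")]
    return opener, jar


opener, jar = build_session("parser.cookies.txt")
# After a request such as opener.open(url), the session can be persisted with:
# jar.save(ignore_discard=True)
```

Every request made through `opener` sends and receives cookies via `jar`, so once the login request succeeds, later page loads run inside the same session.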