When using Beautiful Soup and Requests, I sometimes have to deal with URLs like these:
http://bit.ly/sdflksdfwefwe
http://stup.id/sdfslkjsfsd
http://0.r.msn.com/sdflksdflsdj
Of course, these URLs usually "resolve" to a canonical URL such as http://real-website.com/page.html. How can I get the last URL in the resolution / redirect chain?
My code usually looks like this:
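from bs4 import BeautifulSoup
import requests

# url is one of the short links above
response = requests.get(url)
# BeautifulSoup is imported by name, so it is called without the bs4. prefix
soup = BeautifulSoup(response.text, from_encoding=response.encoding)
canonical_url = response.???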
Please note that I do not want to make a separate request to http://bit.ly/bllsht just to see where it leads. Since I am already fetching the page and parsing it with Beautiful Soup, I would also like to get the canonical URL, that is, the last URL in the redirect chain, from that same response.
Thanks.
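For reference, here is a minimal sketch of one way this might be done from the same response. It relies on two attributes that Requests exposes on a response: `response.url` (the URL of the response finally returned) and `response.history` (the intermediate redirect responses, oldest first). The helper names `redirect_chain` and `fetch_with_final_url` and the `timeout` value are assumptions made up for illustration, not part of either library.

```python
def redirect_chain(response):
    """Return every URL visited, in order, ending with the final one.

    Hypothetical helper: works on any object exposing `.url` and
    `.history`, such as a requests.Response.
    """
    # response.history lists the intermediate redirect responses (oldest first);
    # response.url is the URL of the response that was finally returned.
    return [hop.url for hop in response.history] + [response.url]


def fetch_with_final_url(url, timeout=10):
    """Fetch `url` once and return (response, list of visited URLs)."""
    import requests  # imported here so redirect_chain can be used on its own

    # requests.get follows HTTP redirects by default, so no second request is needed.
    response = requests.get(url, timeout=timeout)
    return response, redirect_chain(response)


# Usage (performs a real network request):
# response, chain = fetch_with_final_url("http://bit.ly/sdflksdfwefwe")
# canonical_url = chain[-1]  # same value as response.url
```

This only covers HTTP-level redirects. A page that redirects through a meta refresh tag or JavaScript, or that declares a different address in a `<link rel="canonical">` element, would need to be handled separately from the parsed HTML.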