Python: "urlparse.urlparse(url).hostname" returns None

Question

After logging in to a site I want to collect its links. I do this with the following code (using the mechanize and urlparse libraries):

    br = mechanize.Browser()
    .
    .  # logging in on website
    .
    for link in br.links():
        url = urlparse.urljoin(link.base_url, link.url)
        hostname = urlparse.urlparse(url).hostname
        path = urlparse.urlparse(url).path
        # print hostname  # by printing this I found it to be the source of the None value
        mylinks.append("http://" + hostname + path)

and I get this error message:

      mylinks.append("http://" + hostname + path)
    TypeError: cannot concatenate 'str' and 'NoneType' objects

I am not sure how to fix this, or even whether it can be fixed at all. Is there a way to force the append to go through, even if it produces a strange, non-working result when the value is None?

Alternatively, what I am really after is what the link ends with. For example, the HTML for one of the links looks like this (what I am after is the word lexik):

 <td class="center"> <a href="http://UnimportantPartOfLink/lexik">lexik</a> </td>

So an alternative route would be for mechanize to collect this value directly, bypassing the URL parsing and the problem with the missing value.

+6

python urlparse

user3053161 Dec 01 '13 at 17:26


2 answers

Why not use a try/except block?

    try:
        mylinks.append("http://" + hostname + path)
    except TypeError:
        continue

If there is an error, it will simply skip the add and continue the loop.
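As a variation on the same idea, the missing hostname can be tested for explicitly instead of catching the TypeError. The sketch below is only an illustration: it uses `urllib.parse`, the Python 3 home of the functions that lived in the `urlparse` module in Python 2, and the helper name `collect_http_links` and the sample hrefs are hypothetical, not taken from the question. It assumes the None values come from links such as `mailto:` or `javascript:`, which have no network location.

```python
from urllib.parse import urljoin, urlparse  # Python 2: the urlparse module


def collect_http_links(base_url, hrefs):
    """Return "http://host/path" strings, skipping hrefs without a hostname."""
    collected = []
    for href in hrefs:
        parts = urlparse(urljoin(base_url, href))
        if parts.hostname is None:
            # For example mailto: or javascript: links have no network location.
            continue
        collected.append("http://" + parts.hostname + parts.path)
    return collected


# Hypothetical sample data, not taken from the original site.
sample_hrefs = ["/lexik", "mailto:someone@example.com", "javascript:void(0)"]
links = collect_http_links("http://example.com/index.html", sample_hrefs)
print(links)
```

Compared with try/except, the explicit check skips only the links that have no hostname and lets any unrelated TypeError surface.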

Hope this helps!

+4

aIKid Dec 01 '13 at 17:43


Arovit · Accepted Answer · 2013-12-01T18:05:57+0000

Another good way, without a try/except block, is this:

Replace hostname = urlparse.urlparse(url).hostname with

 hostname = urlparse.urlparse(url).hostname or ''

and similarly path = urlparse.urlparse(url).path with

 path = urlparse.urlparse(url).path or ''
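A minimal sketch of how the `or ''` fallback behaves, written against `urllib.parse` (the Python 3 name for the `urlparse` module); the helper `host_and_path` and the sample URLs are made up for illustration. Note that `path` is always a string (empty when absent), so the fallback only really matters for `hostname`.

```python
from urllib.parse import urlparse  # Python 2: the urlparse module


def host_and_path(url):
    """Return (hostname, path), substituting '' when hostname is None."""
    parts = urlparse(url)
    hostname = parts.hostname or ""  # None becomes ''
    path = parts.path or ""          # already a str; kept to mirror the answer
    return hostname, path


# Hypothetical sample URLs.
normal = host_and_path("http://example.com/lexik")
no_host = host_and_path("mailto:someone@example.com")

# No TypeError any more, but the result for a host-less link is not a useful URL.
print("http://" + normal[0] + normal[1])
print("http://" + no_host[0] + no_host[1])
```

The concatenation no longer fails, but links without a hostname still end up in the list as odd strings, so filtering them out may be preferable if only real pages are wanted.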

Hope this helps!
