AttributeError: 'bytes' object has no attribute 'timeout'

import re, urllib.request

textfile = open('depth_1.txt','wt')
print('enter the url you would like to crawl')
print('Usage - "http://phocks.org/stumble/creepy/" <-- with the double quotes')
my_url = input()
for i in re.findall(b'''href=["'](.[^"']+)["']''', urllib.request.urlopen(my_url).read(), re.I):
    print(i)
    for ee in re.findall(b'''href=["'](.[^"']+)["']''', urllib.request.urlopen(i).read(), re.I): #this is line 20!
        print(ee)
        textfile.write(ee+'\n')
textfile.close()

I have searched for a solution to this problem but could not find one. The error occurs on line 20 (AttributeError: 'bytes' object has no attribute 'timeout'). I do not quite understand the error, so I am looking for an answer and an explanation of what I did wrong. Thanks!

+4
3 answers

From the documentation for urllib.request.urlopen:

urllib.request.urlopen(url[, data][, timeout])

    Open the URL url, which can be either a string or a Request object.

If urllib.request.urlopen does not receive a string, it assumes the argument is a Request object. You are passing a bytestring (re.findall with a bytes pattern returns bytes matches), so the call fails. For example:

>>> a = urllib.request.urlopen('http://www.google.com').read() # success
>>> a = urllib.request.urlopen(b'http://www.google.com').read() # throws same error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 446, in open
    req.timeout = timeout
AttributeError: 'bytes' object has no attribute 'timeout'
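
The last frame of the traceback shows the failing statement: the opener assigns a timeout attribute on whatever object it was handed. A minimal sketch of that failure, independent of urllib (the variable names are illustrative and not taken from the library):

```python
# A bytes object does not accept new attributes, so the assignment
# below fails the same way `req.timeout = timeout` does in the traceback.
url = b'http://www.google.com'

try:
    url.timeout = 10
except AttributeError as exc:
    message = str(exc)
    print(message)  # the exact wording varies between Python versions
```

A str URL does not hit this path, because the quoted documentation says a string is accepted directly and only non-string arguments are treated as Request objects.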

To fix this, convert your bytestring back to a str by decoding it with the appropriate codec:

>>> a = urllib.request.urlopen(b'http://www.google.com'.decode('ASCII')).read()

Or do not use bytestrings in the first place.
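
One way to avoid bytestrings altogether is to decode the response body once and search it with a str pattern, so that every match is already a str. The sketch below reuses the regular expression from the question. The helper name extract_links and the UTF-8 default are assumptions made for illustration; a real page may declare a different charset.

```python
import re

# str pattern (no b prefix): re.findall then returns str matches
HREF_RE = re.compile(r'''href=["'](.[^"']+)["']''', re.I)

def extract_links(raw_html, encoding='utf-8'):
    """Decode a response body (bytes) and return the href values as str."""
    text = raw_html.decode(encoding, errors='replace')
    return HREF_RE.findall(text)

# Sample bytes standing in for urllib.request.urlopen(my_url).read()
sample = b'<a href="http://example.com/a">A</a> <A HREF=\'/b\'>B</A>'
links = extract_links(sample)
print(links)
```

Because the links are str, writing them with textfile.write(link + '\n') also works; in the question's code ee is bytes, so ee+'\n' would raise a TypeError once the urlopen call is fixed.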

+3

This error occurs because a bytestring cannot be used as a URL. Check where your program produces bytes instead of str and decode them before calling urlopen.
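
Putting that advice together, the script from the question could be reworked roughly as follows. This is a sketch under several assumptions: the names fetch, crawl_depth_1, and main are hypothetical, pages are assumed to be UTF-8, relative links are resolved with urljoin (which the original did not do), and pages that cannot be opened are skipped.

```python
import re
import urllib.request
from urllib.parse import urljoin

# Same pattern as the question, but as str rather than bytes
HREF_RE = re.compile(r'''href=["'](.[^"']+)["']''', re.I)

def fetch(url):
    """Download a page and return its body as str (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode('utf-8', errors='replace')

def crawl_depth_1(start_url, fetch_page=fetch):
    """Return the links found on every page linked from start_url."""
    found = []
    for link in HREF_RE.findall(fetch_page(start_url)):
        page_url = urljoin(start_url, link)  # resolve relative hrefs
        try:
            body = fetch_page(page_url)
        except (OSError, ValueError):
            continue  # skip pages that cannot be opened
        found.extend(urljoin(page_url, href) for href in HREF_RE.findall(body))
    return found

def main():
    # In Python 3, input() returns the text as typed, so no quotes are needed
    my_url = input('Enter the URL you would like to crawl: ')
    with open('depth_1.txt', 'wt') as textfile:
        for url in crawl_depth_1(my_url):
            textfile.write(url + '\n')  # str + str, no bytes involved
```

Calling main() reproduces the behavior the original script was aiming for. The fetch_page parameter exists only so the crawl logic can be exercised without network access.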

+1

Source: https://habr.com/ru/post/1543314/

