"'NoneType' object is not callable" error in BeautifulSoup when using get_text

I wrote this code to extract all the text from a web page:

from BeautifulSoup import BeautifulSoup
import urllib2

soup = BeautifulSoup(urllib2.urlopen('http://www.pythonforbeginners.com').read())
print(soup.get_text())

The problem is that I get this error:

print(soup.get_text())
TypeError: 'NoneType' object is not callable

Any idea on how to solve this?

2 answers

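The method is called soup.getText(), that is, camelCase. This applies to BeautifulSoup 3, which is what "from BeautifulSoup import BeautifulSoup" imports; get_text() is the BeautifulSoup 4 spelling.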

Why you get a TypeError instead of an AttributeError is a mystery to me!
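A likely explanation, assuming BeautifulSoup 3 behaves the way its Tag.__getattr__ is written: an unknown attribute such as soup.get_text is treated as a search for a child tag with that name, much like soup.find('get_text'). The page has no <get_text> tag, so the lookup returns None, and calling None raises TypeError rather than AttributeError. The sketch below reproduces that behaviour with a hypothetical FakeTag class; it is a stand-in for illustration, not BeautifulSoup's actual code.

```python
class FakeTag(object):
    """Hypothetical stand-in that mimics an attribute-to-find() fallback."""

    def __init__(self, children):
        self.children = children  # maps child tag name -> child object

    def find(self, name):
        # Return the matching child, or None when no such tag exists.
        return self.children.get(name)

    def getText(self):
        return "some text"

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.find(name)


soup = FakeTag({"title": "<title>Example</title>"})
print(soup.title)     # resolved through find("title")
print(soup.get_text)  # no <get_text> child, so the lookup yields None

try:
    soup.get_text()   # equivalent to calling None()
except TypeError as error:
    print("TypeError: {}".format(error))
```

Because the fallback never raises AttributeError for ordinary names, a misspelled method name silently becomes None and only fails at the call.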


As Markku suggests in the comments, I would recommend breaking your code up into separate steps.

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)
import urllib2

URL = "http://www.pythonforbeginners.com"
page = urllib2.urlopen(URL)  # reuse the constant instead of repeating the literal
html = page.read()
soup = BeautifulSoup(html)
print(soup.getText())  # BeautifulSoup 3 spells this getText, not get_text

If it still does not work, add some print statements to find out what is happening.

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)
import urllib2

URL = "http://www.pythonforbeginners.com"
print("URL is {} and its type is {}".format(URL, type(URL)))
page = urllib2.urlopen(URL)  # reuse the constant instead of repeating the literal
print("Page is {} and its type is {}".format(page, type(page)))
html = page.read()
print("html is {} and its type is {}".format(html, type(html)))
soup = BeautifulSoup(html)
print("soup is {} and its type is {}".format(soup, type(soup)))
print(soup.getText())  # BeautifulSoup 3 spells this getText, not get_text
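The same kind of check can be done in code instead of by reading printed output. This is a minimal sketch, assuming the goal is to call whichever text-extraction method the installed BeautifulSoup version really provides (get_text in BeautifulSoup 4, getText in BeautifulSoup 3). The names first_callable and OldStyleSoup are hypothetical and are not part of either library; OldStyleSoup only imitates an object whose unknown attributes come back as None.

```python
def first_callable(obj, names):
    """Return the first attribute of obj listed in names that is callable."""
    for name in names:
        candidate = getattr(obj, name, None)
        if callable(candidate):
            return candidate
    raise AttributeError(
        "none of {} is callable on {!r}".format(names, type(obj)))


class OldStyleSoup(object):
    """Hypothetical stand-in for an object that only offers getText."""

    def getText(self):
        return "page text"

    def __getattr__(self, name):
        # Imitates a fallback that yields None for unknown attribute names.
        if name.startswith("__"):
            raise AttributeError(name)
        return None


# get_text resolves to None here, so the helper falls through to getText.
extract = first_callable(OldStyleSoup(), ["get_text", "getText"])
print(extract())
```

Checking callable() before calling turns the confusing "'NoneType' object is not callable" into an explicit choice between the two spellings.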

Source: https://habr.com/ru/post/1528144/

