Python Beautiful Soup 'NoneType' Object Error

I am using Beautiful Soup to extract the hyperlinks in the body of web pages. Here is the code I'm using:

import urllib2
from bs4 import BeautifulSoup

url = 'http://www.1914-1918.net/swb.htm'
element = 'body'
request = urllib2.Request(url)
page = urllib2.urlopen(request).read()
pageSoup = BeautifulSoup(page)
for elementSoup in pageSoup.find_all(element):
  for linkSoup in elementSoup.find_all('a'):
    print linkSoup['href']

I got an AttributeError when I tried to find the hyperlinks on the swb.htm page.

AttributeError: 'NoneType' object has no attribute 'next_element'

I am sure that the page has a body element and a couple of "a" elements under it. Strangely, the same code works fine for other pages (e.g. http://www.1914-1918.net/1div.htm).

This problem has been bugging me for several days. Can someone point out what I did wrong?


+4


import urllib2
from bs4 import BeautifulSoup

url = 'http://www.1914-1918.net/swb.htm'
element = 'body'
request = urllib2.Request(url)
page = urllib2.urlopen(request).read()
pageSoup = BeautifulSoup(page)
for elementSoup in pageSoup.find_all(element):
  for linkSoup in elementSoup.find_all('a'):
    print linkSoup['href']
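
As a cross-check that does not depend on Beautiful Soup, the links can also be collected with the standard library's HTML parser. If this succeeds on a page where the loop above fails, the problem is more likely in how Beautiful Soup's chosen parser handles that page's markup than in the page lacking links. This is only a sketch: LinkCollector and extract_links are hypothetical names, and the sample string stands in for the downloaded page.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2, as used in the question


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags; <a> tags without href are skipped."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value is not None:
                self.links.append(value)


def extract_links(html):
    """Return the href values found in the given HTML string, in order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Stand-in markup (an assumption, not the real page content).
sample = '<body><a href="1div.htm">1st Division</a><a name="top">no href</a></body>'
print(extract_links(sample))
```

Skipping anchors without an href also avoids the KeyError that linkSoup['href'] would raise on such a tag.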


+3

Maybe your beautifulsoup4 version is not compatible with your Python. Try uninstalling it: pip uninstall beautifulsoup4 and then install an older version: pip install beautifulsoup4==<version>. I am using version 4.1.3.
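
Before swapping versions, it may help to confirm which beautifulsoup4 release the interpreter actually imports. A small sketch, assuming the check is run with the same Python that runs the scraper; bs4_version is a hypothetical helper that reads the __version__ attribute exposed by the bs4 package:

```python
def bs4_version():
    """Return the imported bs4 version string, or None if bs4 is not installed."""
    try:
        import bs4
    except ImportError:
        return None
    return getattr(bs4, '__version__', None)


print(bs4_version())
```

If this prints None, the package is not visible to that interpreter at all, which points to an installation problem and not a version mismatch.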

-1

Source: https://habr.com/ru/post/1536808/
