Python Beautiful Soup 'NoneType' Object Error

Question

I use Beautiful Soup to get the hyperlinks in the body of web pages. Here is the code I'm using:

import urllib2
from bs4 import BeautifulSoup

url = 'http://www.1914-1918.net/swb.htm'
element = 'body'
request = urllib2.Request(url)
page = urllib2.urlopen(request).read()
pageSoup = BeautifulSoup(page)
for elementSoup in pageSoup.find_all(element):
  for linkSoup in elementSoup.find_all('a'):
    print linkSoup['href']

I got an AttributeError when I tried to find the hyperlinks on the swb.htm page:

AttributeError: 'NoneType' object has no attribute 'next_element'

I am sure that the page has a body element and a couple of "a" elements under it. Strangely, the same code works fine for other pages (e.g. http://www.1914-1918.net/1div.htm ).

This problem has been haunting me for several days. Can someone point out what I did wrong?

+4

python html beautifulsoup findall

WeimusT Apr 16 '14 at 15:27

3

João Pereira · Answer 1 · 2014-04-16T16:05:14+0000

. :

import urllib2
from bs4 import BeautifulSoup

url = 'http://www.1914-1918.net/swb.htm'
element = 'body'
request = urllib2.Request(url)
page = urllib2.urlopen(request).read()
pageSoup = BeautifulSoup(page)
for elementSoup in pageSoup.find_all(element):
  for linkSoup in elementSoup.find_all('a'):
    print linkSoup['href']

.

Thiago Argolo · Answer 2 · 2014-08-11T22:18:15+0000

Try a different parser, such as html5lib.

.

See: https://bugs.launchpad.net/beautifulsoup/+bug/1184417
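A minimal sketch of what switching parsers could look like, assuming Python 3 and that html5lib may or may not be installed (pip install html5lib). The helper name extract_links and the sample markup are made up for illustration, and whether a different parser actually avoids the error on swb.htm has not been verified here. Passing the parser name explicitly also avoids depending on whichever parser Beautiful Soup happens to pick by default.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def extract_links(html, parsers=("html5lib", "html.parser")):
    """Return the href of every <a> inside <body>, trying parsers in order.

    `parsers` is an illustrative preference list: html5lib first (as the
    answer suggests), then the parser bundled with Python as a fallback.
    """
    soup = None
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
            break
        except FeatureNotFound:
            # This parser is not installed; try the next one.
            continue
    if soup is None:
        raise RuntimeError("none of the requested parsers is installed")

    links = []
    for body in soup.find_all("body"):
        # href=True skips anchors without an href attribute, so
        # anchor["href"] cannot raise KeyError.
        for anchor in body.find_all("a", href=True):
            links.append(anchor["href"])
    return links


if __name__ == "__main__":
    # Hypothetical markup standing in for a downloaded page.
    sample = '<html><body><a href="1div.htm">1st</a><a name="top">x</a></body></html>'
    print(extract_links(sample))
```

To use it on a real page, the downloaded bytes can be passed in place of the sample string, for example the result of urllib.request.urlopen(url).read() in Python 3.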

LeonPak · Answer 3 · 2016-03-06T14:48:22+0000

Maybe your beautifulsoup4 version is not compatible with your Python version. Try uninstalling it (pip uninstall beautifulsoup4) and installing an older version (pip install beautifulsoup4==<version>). I am using version 4.1.3.
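Before downgrading, it may help to confirm which versions are actually installed. The following is a small sketch, assuming Python 3.8 or newer (where importlib.metadata is in the standard library); the helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="beautifulsoup4"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the scraping library and the optional parser mentioned above.
    for name in ("beautifulsoup4", "html5lib"):
        print(name, installed_version(name))
```

A result of None means the package is not installed in the interpreter that runs the script, which can differ from the one pip installs into.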
