I am trying to scrape the Year and Winners columns (first and second columns) from the “List of final matches” table (second table) at http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals. I use the following code:
    import urllib2
    from BeautifulSoup import BeautifulSoup

    url = "http://www.samhsa.gov/data/NSDUH/2k10State/NSDUHsae2010/NSDUHsaeAppC2010.htm"
    soup = BeautifulSoup(urllib2.urlopen(url).read())

    soup.findAll('table')[0].tbody.findAll('tr')
    for row in soup.findAll('table')[0].tbody.findAll('tr'):
        first_column = row.findAll('th')[0].contents
        third_column = row.findAll('td')[2].contents
        print first_column, third_column
With the above code, I was able to get the first and third columns just fine. But when I use the same code with http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals, it cannot find tbody as an element of the table, even though I can see the tbody when I inspect the element in the browser.
    url = "http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals"
    soup = BeautifulSoup(urllib2.urlopen(url).read())

    print soup.findAll('table')[2]

    soup.findAll('table')[2].tbody.findAll('tr')
    for row in soup.findAll('table')[0].tbody.findAll('tr'):
        first_column = row.findAll('th')[0].contents
        third_column = row.findAll('td')[2].contents
        print first_column, third_column
Here is the error message I got:
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-150-fedd08c6da16> in <module>()
          7 # print soup.findAll('table')[2]
          8
    ----> 9 soup.findAll('table')[2].tbody.findAll('tr')
         10 for row in soup.findAll('table')[0].tbody.findAll('tr'):
         11     first_column = row.findAll('th')[0].contents

    AttributeError: 'NoneType' object has no attribute 'findAll'
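A possible explanation, which I have not confirmed for this page: browsers add a tbody element when they build the DOM even if the raw HTML omits it, so the element inspector can show a tbody that the downloaded source does not contain. In that case `.tbody` is `None`, which would match the error above. The sketch below only illustrates the idea of collecting rows without going through tbody. It assumes Python 3 and swaps in the standard library's `html.parser` for BeautifulSoup; `RowCollector`, `extract_rows`, and `SAMPLE` are made-up names, and the sample markup is a tiny stand-in for the real table, not the actual Wikipedia source.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>.

    <tbody> tags are simply ignored, so the result is the same
    whether or not the markup contains them.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return a list of rows, each a list of cell texts (hypothetical helper)."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


# Illustrative stand-in markup with no <tbody>, as raw page source often looks.
SAMPLE = """
<table>
  <tr><th>1930</th><td>Uruguay</td></tr>
  <tr><th>1934</th><td>Italy</td></tr>
</table>
"""

rows = extract_rows(SAMPLE)
for row in rows:
    print(row[0], row[1])
```

With BeautifulSoup itself, the analogous change would be to call `findAll('tr')` directly on the table instead of on `.tbody`, since `findAll` searches all descendants of the tag it is called on.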