Beautifulsoup: find_all on bs4.element.ResultSet object or list?

Question

Hi, I called find_all on a BeautifulSoup object and got back a bs4.element.ResultSet object (or, after slicing it, a plain list).

I want to call find_all again on that result, but bs4.element.ResultSet does not support it. I can loop over the elements of the ResultSet and call find_all on each one, but can I avoid the loop and convert the result back into a BeautifulSoup object instead?

See the code for more details. Thanks

    from bs4 import BeautifulSoup

    html_1 = """
    <table>
        <thead>
            <tr class="myClass">
                <th>A</th>
                <th>B</th>
                <th>C</th>
                <th>D</th>
            </tr>
        </thead>
    </table>
    """

    soup = BeautifulSoup(html_1, 'html.parser')
    type(soup)          # bs4.BeautifulSoup

    # do find_all on the BeautifulSoup object
    th_all = soup.find_all('th')

    # the result is a bs4.element.ResultSet; slicing it gives a plain list
    type(th_all)        # bs4.element.ResultSet
    type(th_all[0:1])   # list

    # now I want to call find_all again on the result
    # th_all.find_all(text='A')   # does not work: raises AttributeError

    # can I avoid this loop?
    for th in th_all:
        th.find_all(text='A')     # works

python html html-parsing beautifulsoup

Y Zhang Mar 18 '16 at 4:17

1 answer

alecxe · Accepted Answer · 2016-03-19T12:15:41+0000

ResultSet is a subclass of list, not of Tag, and only Tag provides the find* methods. The most common approach is to loop over the results of find_all():

    th_all = soup.find_all('th')
    result = []
    for th in th_all:
        result.extend(th.find_all(text='A'))
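As a minimal, self-contained sketch of the same idea, the loop can also be written as a nested list comprehension. The HTML is the sample from the question; the names `th_all` and `matches` are only illustrative. It assumes bs4 4.4.0 or later, where `string=` is the preferred spelling of the older `text=` argument.

```python
from bs4 import BeautifulSoup

html = """
<table>
  <thead>
    <tr class="myClass">
      <th>A</th> <th>B</th> <th>C</th> <th>D</th>
    </tr>
  </thead>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
th_all = soup.find_all("th")

# ResultSet behaves like a list of Tag objects, so iterate over it and
# call find_all() on each Tag, flattening the per-tag results as we go.
matches = [hit for th in th_all for hit in th.find_all(string="A")]
```

This still loops under the hood; it only flattens the per-tag results into a single list in one expression.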

Often a CSS selector can solve this in one go, although not everything you can do with find_all() is possible with select(). For example, bs4's CSS selectors have historically had no "textual" search. But if you needed to find all, say, b elements inside th elements, you could do:

    soup.select("th b")
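A small runnable sketch of the selector approach follows. The HTML here is invented for illustration (the question's sample has no b elements), and the variable names are hypothetical. It uses the descendant selector "th b" to match b elements nested inside th elements.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two of the three header cells wrap their text in <b>.
html = """
<table>
  <tr>
    <th><b>A</b></th>
    <th>B</th>
    <th><b>C</b></th>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# One call replaces "find every th, then find every b inside each th".
bold_in_headers = soup.select("th b")
labels = [b.get_text() for b in bold_in_headers]
```

With this markup, `labels` contains the text of the two bold header cells; the plain `<th>B</th>` cell is skipped because it has no b descendant.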
