BeautifulSoup: find_all on a bs4.element.ResultSet object or list?

Hi, I called find_all on a BeautifulSoup object and got back a bs4.element.ResultSet object (or, after slicing, a list).

I want to call find_all again on that result, but the bs4.element.ResultSet object does not allow it. I can loop over each element of the ResultSet and call find_all on it. But can I avoid the loop and just convert the result back into a BeautifulSoup object?

See the code for more details. Thanks.

    from bs4 import BeautifulSoup

    html_1 = """
    <table>
      <thead>
        <tr class="myClass">
          <th>A</th>
          <th>B</th>
          <th>C</th>
          <th>D</th>
        </tr>
      </thead>
    </table>
    """

    soup = BeautifulSoup(html_1, 'html.parser')
    type(soup)  # bs4.BeautifulSoup

    # call find_all on the BeautifulSoup object
    th_all = soup.find_all('th')

    # the result is a bs4.element.ResultSet; slicing it gives a plain list
    type(th_all)       # bs4.element.ResultSet
    type(th_all[0:1])  # list

    # now I want to call find_all again
    th_all.find_all(text='A')  # does not work

    # can I avoid this loop?
    for th in th_all:
        th.find_all(text='A')  # works
1 answer

ResultSet is a subclass of list, not of Tag, and only Tag objects have the find* methods. The usual approach is to loop over the results of find_all():

    th_all = soup.find_all('th')
    result = []
    for th in th_all:
        result.extend(th.find_all(text='A'))
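If the loop shows up in several places, it can be wrapped once. The sketch below is only an illustration: `find_all_in` is a hypothetical helper name, not part of bs4, and the sample HTML is made up. It uses the `string` keyword, which newer bs4 releases prefer over `text`.

```python
from bs4 import BeautifulSoup

# Made-up sample markup for illustration only
html = "<table><tr><th>A</th><th>B</th><th>A</th></tr></table>"
soup = BeautifulSoup(html, "html.parser")


def find_all_in(tags, *args, **kwargs):
    """Hypothetical helper: call find_all on every tag and flatten the results."""
    return [match for tag in tags for match in tag.find_all(*args, **kwargs)]


# Search inside every <th> for a string equal to "A"
matches = find_all_in(soup.find_all("th"), string="A")
```

The loop is still there, just hidden inside the helper; the result is a plain list of the matching strings.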

Often a CSS selector lets you do this in one go, although not everything you can do with find_all() is possible with select(). For example, standard CSS selectors have no "text" search. But if you needed to find, say, all b elements inside th elements, you could do:

    soup.select("th b")
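As a quick check of the selector approach, here is a minimal sketch on made-up markup (the sample HTML and variable names are assumptions, not from the question). A descendant selector does the nested search in a single call:

```python
from bs4 import BeautifulSoup

# Made-up sample markup: two of the three header cells contain a <b> element
html = "<table><tr><th><b>A</b></th><th>B</th><th><b>C</b></th></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# "th b" matches every <b> that is a descendant of a <th>
bold_in_headers = soup.select("th b")
texts = [b.get_text() for b in bold_in_headers]
```

Here `texts` holds the text of the two b elements; the plain `<th>B</th>` cell is skipped because it contains no b element.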

Source: https://habr.com/ru/post/1245283/

