As far as I know, BeautifulSoup has no generator version of "find", but we can combine SoupStrainer with the .children generator.
Suppose we have this HTML sample:
    <div>
        <item>Item 1</item>
        <item>Item 2</item>
        <item>Item 3</item>
        <item>Item 4</item>
        <item>Item 5</item>
    </div>
from which we need to get the text of all item nodes.
We can use SoupStrainer to parse only the item tags, then iterate over the .children generator and get the text of each one:
    from bs4 import BeautifulSoup, SoupStrainer

    data = """
    <div>
        <item>Item 1</item>
        <item>Item 2</item>
        <item>Item 3</item>
        <item>Item 4</item>
        <item>Item 5</item>
    </div>"""

    # Build the tree from <item> tags only; everything else is skipped
    parse_only = SoupStrainer('item')
    soup = BeautifulSoup(data, "html.parser", parse_only=parse_only)

    # The strained soup holds the <item> tags as its direct children
    for item in soup.children:
        print(item.get_text())
This prints:

    Item 1
    Item 2
    Item 3
    Item 4
    Item 5
In other words, the idea is to trim the tree down to the desired tags and then use one of the available generators, such as .children. You can also use one of these generators directly and filter the tags by name or other criteria inside the generator body, for example:
    def generate_items(soup):
        for tag in soup.descendants:
            if tag.name == "item":
                yield tag.get_text()
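The point of such a generator is that the caller can stop early. Here is a minimal sketch of that, assuming a shortened version of the sample HTML and using itertools.islice to take only the first two matches. The getattr guard is my own defensive addition, and note that only the traversal is lazy: the document itself is still parsed in full up front.

```python
from itertools import islice

from bs4 import BeautifulSoup


def generate_items(soup):
    """Lazily yield the text of every <item> tag in the tree."""
    for node in soup.descendants:
        # getattr keeps this safe for nodes that are not tags
        if getattr(node, "name", None) == "item":
            yield node.get_text()


data = "<div><item>Item 1</item><item>Item 2</item><item>Item 3</item></div>"
soup = BeautifulSoup(data, "html.parser")

# Pull only the first two matches; the walk stops after the second one
first_two = list(islice(generate_items(soup), 2))
print(first_two)  # ['Item 1', 'Item 2']

# next() with a default behaves like a lazy "find"
first = next(generate_items(soup), None)
print(first)  # Item 1
```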
.descendants yields every nested node recursively, while .children yields only a node's direct children.
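To make the difference concrete, here is a small sketch on an unstrained tree (again with a shortened sample, and with isinstance(node, Tag) as my own way of skipping text nodes):

```python
from bs4 import BeautifulSoup, Tag

data = "<div><item>Item 1</item><item>Item 2</item></div>"
soup = BeautifulSoup(data, "html.parser")

# .children of the soup: only the top-level <div>
top_level = [node.name for node in soup.children if isinstance(node, Tag)]

# .descendants of the soup: the <div> and every tag nested inside it
all_tags = [node.name for node in soup.descendants if isinstance(node, Tag)]

print(top_level)  # ['div']
print(all_tags)   # ['div', 'item', 'item']
```

So without SoupStrainer, iterating over soup.children would never reach the item tags; you would need either soup.div.children or .descendants.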