Getting id names with BeautifulSoup

If I have the text:

text = '<span id="foo"></span> <div id="bar"></div>' 

where the text may change (and may contain no ids at all), how can I use BeautifulSoup to get the id values regardless of tag name (that is, return ['foo', 'bar'])? I am not very experienced with BeautifulSoup and am confused about this task.

1 answer

You need to find every tag that has an id attribute, and then collect the id attribute values into a list, for example:

    from BeautifulSoup import BeautifulSoup

    text = '<span id="foo"></span> <div id="bar"></div>'
    pool = BeautifulSoup(text)

    result = []
    # True matches any tag name; {'id': True} keeps only tags that have an id attribute
    for tag in pool.findAll(True, {'id': True}):
        result.append(tag['id'])

and the result is:

    >>> result
    [u'foo', u'bar']
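The answer above targets the old BeautifulSoup 3 package on Python 2. With the newer bs4 package, the same idea might look roughly like the sketch below. It assumes bs4 is installed and uses the built-in "html.parser" backend. The helper name get_ids is hypothetical and only there for illustration.

```python
from bs4 import BeautifulSoup


def get_ids(html):
    """Return the id attribute of every tag that has one, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    # id=True matches any tag, whatever its name, that carries an id attribute
    return [tag["id"] for tag in soup.find_all(id=True)]


text = '<span id="foo"></span> <div id="bar"></div>'
print(get_ids(text))  # ['foo', 'bar']

# Text without any ids simply yields an empty list
print(get_ids("<p>no identifiers here</p>"))  # []
```

Since find_all returns an empty result when nothing matches, text without ids needs no special handling.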

Source: https://habr.com/ru/post/1446724/