Getting identifier names with BeautifulSoup
Suppose I have the text:
text = '<span id="foo"></span> <div id="bar"></div>'
where the text may change (and may not contain any identifiers at all). How can I use BeautifulSoup to get the identifier names regardless of tag name (i.e. return ['foo', 'bar'])? I am not very experienced with BeautifulSoup and am confused about this task.
1 answer
You need to find every tag that has an id attribute, whatever its tag name, and then collect the id attribute values into a list, for example:
from BeautifulSoup import BeautifulSoup

text = '<span id="foo"></span> <div id="bar"></div>'
pool = BeautifulSoup(text)

result = []
for tag in pool.findAll(True, {'id': True}):
    result.append(tag['id'])

and the result:

>>> result
[u'foo', u'bar']
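The code above uses the BeautifulSoup 3 import path. For readers on BeautifulSoup 4, the same idea can be sketched roughly as follows. This assumes the beautifulsoup4 package is installed and uses the standard library's "html.parser" backend; the helper name get_ids is hypothetical and only used for illustration.

```python
from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 (package: beautifulsoup4)


def get_ids(markup):
    """Return the id of every tag that has one, in document order.

    get_ids is a hypothetical helper name, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # id=True matches any tag that carries an id attribute,
    # regardless of the tag name.
    return [tag["id"] for tag in soup.find_all(id=True)]


text = '<span id="foo"></span> <div id="bar"></div>'
ids = get_ids(text)

# Markup without any id attributes yields an empty list rather than an error.
no_ids = get_ids("<p>no identifiers here</p>")
```

Because find_all simply returns no matches when nothing carries an id, the changing-text case from the question (including text with no identifiers) should not need special handling.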