Python: BeautifulSoup - get attribute value based on name attribute

Question

I want to print the value of an attribute based on its name, e.g.

<META NAME="City" content="Austin">

I want to do something like this

 soup = BeautifulSoup(f)  # f is some HTML containing the above meta tag
 for meta_tag in soup('meta'):
     if meta_tag['name'] == 'City':
         print(meta_tag['content'])

The above code gives KeyError: 'name'. I believe this is because name is used by BeautifulSoup, so it cannot be used as a keyword argument.

+49

python beautifulsoup

Ruth Jun 26 '12 at 10:29

5 answers

theharshest has already answered the question, but here is another way to do the same thing. Also note that in your example NAME is in uppercase, while your code uses lowercase name.

 from bs4 import BeautifulSoup

 s = '<div class="question" id="get attrs" name="python" x="something">Hello World</div>'
 soup = BeautifulSoup(s, 'html.parser')
 attributes_dictionary = soup.find('div').attrs
 print(attributes_dictionary)
 # prints (key order may vary): {'id': 'get attrs', 'x': 'something', 'class': ['question'], 'name': 'python'}
 print(attributes_dictionary['class'][0])
 # prints: question
 print(soup.find('div').get_text())
 # prints: Hello World
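On the uppercase NAME point, here is a small sketch of what the parser does with it. It assumes bs4 with the built-in html.parser, which lowercases attribute names, so the uppercase NAME in the markup ends up under the key 'name'. Other parsers may behave differently.

```python
from bs4 import BeautifulSoup

# The tag from the question, with the attribute name in uppercase.
soup = BeautifulSoup('<META NAME="City" content="Austin">', "html.parser")
meta_tag = soup.find("meta")

# html.parser lowercases attribute names, so the key is 'name', not 'NAME'.
attribute_names = sorted(meta_tag.attrs)
print(attribute_names)
print(meta_tag["name"])
```

So with this parser the case difference alone should not be what raises the KeyError.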

+12

Delicious Mar 12 '14 at 17:56

The accepted answer is the best solution, but FYI, the problem you are running into is that a Tag object in Beautiful Soup acts like a Python dictionary. If you access tag['name'] on a tag that does not have a 'name' attribute, you will get a KeyError.
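A minimal sketch of that dictionary behaviour, assuming bs4 with html.parser. The extra charset meta tag and the helper name city_values are made up for illustration; the point is that Tag.get() returns None for a missing attribute instead of raising.

```python
from bs4 import BeautifulSoup

# Hypothetical input: the first meta tag has no 'name' attribute at all.
html = '<meta charset="utf-8"><META NAME="City" content="Austin">'
soup = BeautifulSoup(html, "html.parser")


def city_values(parsed):
    """Collect the content of every meta tag whose name attribute is 'City'."""
    values = []
    for meta_tag in parsed.find_all("meta"):
        # meta_tag['name'] would raise KeyError on the charset tag;
        # meta_tag.get('name') returns None there, so the loop keeps going.
        if meta_tag.get("name") == "City":
            values.append(meta_tag["content"])
    return values


print(city_values(soup))
```

Using .get() keeps the loop from the question almost unchanged while tolerating meta tags that lack the attribute.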

+5

Leonard Richardson Jun 26 '12 at 12:18

The following works:

 from bs4 import BeautifulSoup

 soup = BeautifulSoup('<META NAME="City" content="Austin">', 'html.parser')
 metas = soup.find_all("meta")
 for meta in metas:
     print(meta.attrs['content'], meta.attrs['name'])

+2

BrightMoon Mar 23 '17 at 20:40

You can also try this solution:

To find the value written inside a span in a table:

htmlContent

 <table>
   <tr>
     <th> ID </th>
     <th> Name </th>
   </tr>
   <tr>
     <td> <span name="spanId" class="spanclass">ID123</span> </td>
     <td> <span>Bonny</span> </td>
   </tr>
 </table>

Python code

 from bs4 import BeautifulSoup

 soup = BeautifulSoup(htmlContent, "lxml")
 tables = soup.find_all("table")
 for table in tables:
     storeValueRows = table.find_all("tr")
     # .string keeps the surrounding whitespace, so strip it before comparing
     thValue = storeValueRows[0].find_all("th")[0].string.strip()
     if thValue == "ID":  # with this condition I verify that this is the table I wanted
         # storeValueRows[1] is the second <tr> of the table (index 1),
         # find_all("span")[0] gives its first <span>, and .string gives that span's text
         value = storeValueRows[1].find_all("span")[0].string
         value = value.strip()  # removes whitespace from the start and end of the string
         # find using an attribute:
         value = storeValueRows[1].find("span", {"name": "spanId"})['class']
         print(value)  # prints: ['spanclass'] (bs4 returns class as a list)

0

Ujjaval Moradiya Oct 20 '16 at 5:38

theharshest · Accepted Answer · 2012-06-26 10:51

It is quite simple. Use the following:

 >>> soup = BeautifulSoup('<META NAME="City" content="Austin">')
 >>> soup.find("meta", {"name":"City"})
 <meta name="City" content="Austin" />
 >>> soup.find("meta", {"name":"City"})['content']
 u'Austin'
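For reference, a rough Python 3 version of the same lookup, assuming bs4 with html.parser. The first positional parameter of find() is itself called name and means the tag name, which is why the HTML attribute named "name" goes into the attrs dictionary rather than being passed as a keyword argument. The None check is an addition for the case where no matching tag exists.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<META NAME="City" content="Austin">', "html.parser")

# find()'s first positional parameter is the tag name, so the HTML attribute
# called "name" is passed through the attrs dictionary instead.
meta_tag = soup.find("meta", attrs={"name": "City"})

# find() returns None when nothing matches, so guard before indexing.
city = meta_tag["content"] if meta_tag is not None else None
print(city)
```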

Leave a comment if something is unclear.
