Replacing tags of one type with tags of another in BeautifulSoup

I have a collection of HTML files. I want to iterate over them one by one and edit the markup that carries a certain class. The code I want to change looks like this, with the following class names:

<td class='thisIsMyClass' colspan=4> <a id='123' class='thisIsMyOtherClass' href='123'>Put me Elsewhere</a> 

This can occur several times in the same document, with different text in place of "Put me Elsewhere", but always with the same classes.

I want to change it to the following form:

    <font SIZE="3" COLOR="#333333" FACE="Verdana" STYLE="background-color:#ffffff;font-weight: bold;">
    <h2>Put Me Elsewhere</h2>
    </font>

    import os
    for filename in os.listdir('dirname'):
        replace(filename)

    def replace(filename):
        tags = soup.find_all(attrs={"thisIsMyClass"})

I'm not too sure where to go after this, or how to deal with a list of tags. Any help is appreciated. Thanks :)

2 answers

A cleaner approach is to prepare the replacement HTML as a string with a placeholder, find all the td tags with the thisIsMyClass class, and use .replace_with() to replace each one:

    from bs4 import BeautifulSoup

    data = """
    <table>
        <tr>
            <td class='thisIsMyClass' colspan=4>
                <a id='123' class='thisIsMyOtherClass' href='123'>Put me Elsewhere</a>
            </td>
        </tr>
    </table>
    """

    replacement = """
    <font SIZE="3" COLOR="#333333" FACE="Verdana" STYLE="background-color:#ffffff;font-weight: bold;">
        <h2>{text}</h2>
    </font>
    """

    soup = BeautifulSoup(data, 'html.parser')
    for td in soup.select('td.thisIsMyClass'):
        # Build the replacement from the link text and swap it in for the whole cell.
        td.replace_with(BeautifulSoup(replacement.format(text=td.a.text), 'html.parser'))

    print(soup.prettify())

This prints:

    <table>
     <tr>
      <font color="#333333" face="Verdana" size="3" style="background-color:#ffffff;font-weight: bold;">
       <h2>
        Put me Elsewhere
       </h2>
      </font>
     </tr>
    </table>
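
To connect this back to the question's loop over files, here is a minimal sketch that wraps the same format-and-replace_with() idea in a function and applies it to every file in a directory. The names rewrite_html and rewrite_directory, the *.html glob, and the UTF-8 encoding are assumptions made for illustration, not part of the original answer. The link text is passed through html.escape before being formatted into the template, so characters such as < or & in the text cannot break the replacement markup.

```python
from html import escape
from pathlib import Path

from bs4 import BeautifulSoup

# Same replacement markup as in the answer, with a {text} placeholder.
REPLACEMENT = (
    '<font SIZE="3" COLOR="#333333" FACE="Verdana" '
    'STYLE="background-color:#ffffff;font-weight: bold;">'
    "<h2>{text}</h2></font>"
)


def rewrite_html(markup):
    """Return markup with each td.thisIsMyClass replaced by the font/h2 block."""
    soup = BeautifulSoup(markup, "html.parser")
    for td in soup.select("td.thisIsMyClass"):
        if td.a is None:
            # No link inside this cell, so there is no text to move; leave it alone.
            continue
        text = escape(td.a.get_text(strip=True))
        fragment = BeautifulSoup(REPLACEMENT.format(text=text), "html.parser")
        td.replace_with(fragment)
    return str(soup)


def rewrite_directory(dirname):
    """Rewrite every *.html file in dirname in place (assumed layout and encoding)."""
    for path in Path(dirname).glob("*.html"):
        original = path.read_text(encoding="utf-8")
        path.write_text(rewrite_html(original), encoding="utf-8")
```

Calling rewrite_directory('dirname') would then overwrite the files, so it is worth trying rewrite_html on a single string first, or writing to a copy of the directory.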

It is as simple as assigning to the tag's .name attribute.

    # for quick testing:
    # tag = BeautifulSoup("<td class='thisIsMyClass' colspan=4><a id='123' class='thisIsMyOtherClass' href='123'>Put me Elsewhere</a>")
    # tags = [tag]
    for tag in tags:
        tag.td.name = "font"
        tag.font["SIZE"] = 3
        del tag.font["class"]
        ...
        tag.a.name = "h2"
        ...
        print(tag)
        # <font SIZE="3" colspan="4"><h2 class="thisIsMyOtherClass" href="123" id="123">Put me Elsewhere</h2></font>

Also, the documentation is your friend. It is pretty complete.
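
The elided steps above could be filled in many ways. One possible sketch of the in-place renaming idea follows; instead of setting and deleting attributes one by one, it replaces each tag's attrs dictionary wholesale, which is a choice made here for brevity and not something the answer prescribes. The sample markup and the variable names html, link and result are assumptions for illustration.

```python
from bs4 import BeautifulSoup

html = (
    "<table><tr><td class='thisIsMyClass' colspan=4>"
    "<a id='123' class='thisIsMyOtherClass' href='123'>Put me Elsewhere</a>"
    "</td></tr></table>"
)
soup = BeautifulSoup(html, "html.parser")

for td in soup.find_all("td", class_="thisIsMyClass"):
    link = td.find("a", class_="thisIsMyOtherClass")
    if link is None:
        # Cell without the expected link: leave it unchanged.
        continue
    # Turn the link into a bare <h2>, dropping id, class and href.
    link.name = "h2"
    link.attrs = {}
    # Turn the cell into <font> and swap in the desired attributes.
    td.name = "font"
    td.attrs = {
        "size": "3",
        "color": "#333333",
        "face": "Verdana",
        "style": "background-color:#ffffff;font-weight: bold;",
    }

result = str(soup)
```

Because the tags are renamed where they sit, any other children of the cell stay in place, whereas replace_with() discards the whole cell and keeps only what the template contains.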


Source: https://habr.com/ru/post/978950/

