As already mentioned, you can use the remove() method to remove (sub) elements from the tree:
for bad in tree.xpath("//fruit[@state='rotten']"):
    bad.getparent().remove(bad)
However, remove() drops the element together with its tail (the text that follows the element's closing tag), which is a problem if you are processing mixed-content documents such as HTML:
<div><fruit state="rotten">avocado</fruit> Hello!</div>
becomes
<div></div>
Which I assume is not always what you want :) I created a helper function that removes only the element and keeps its tail:
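def remove_element(el):
    parent = el.getparent()
    # el.tail is None when no text follows the element, so check it before calling strip()
    if el.tail and el.tail.strip():
        prev = el.getprevious()
        # compare with None explicitly: an lxml element without children is falsy
        if prev is not None:
            prev.tail = (prev.tail or '') + el.tail
        else:
            parent.text = (parent.text or '') + el.tail
    parent.remove(el)

for bad in tree.xpath("//fruit[@state='rotten']"):
    remove_element(bad)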
This way the tail text is preserved:
<div> Hello!</div>
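For anyone who wants to try this out, here is a minimal self-contained sketch that compares plain remove() with the tail-preserving helper. It assumes lxml is installed and parses the markup with etree.fromstring(). The strip_rotten() wrapper and the two sample strings are hypothetical names I made up for the demo. They are not part of lxml.

```python
from lxml import etree


def remove_element(el):
    """Remove el from its parent but keep the text that follows it (its tail)."""
    parent = el.getparent()
    if el.tail and el.tail.strip():
        prev = el.getprevious()
        if prev is not None:
            # A preceding sibling exists: append the tail to that sibling's tail.
            prev.tail = (prev.tail or "") + el.tail
        else:
            # el is the first child: the tail becomes part of the parent's text.
            parent.text = (parent.text or "") + el.tail
    parent.remove(el)


def strip_rotten(xml, keep_tail=True):
    """Hypothetical demo wrapper: parse xml, drop rotten <fruit> elements, serialize."""
    root = etree.fromstring(xml)
    for bad in root.xpath("//fruit[@state='rotten']"):
        if keep_tail:
            remove_element(bad)
        else:
            bad.getparent().remove(bad)
    return etree.tostring(root, encoding="unicode")


# Hypothetical sample inputs covering both branches of the helper.
first_child = '<div><fruit state="rotten">avocado</fruit> Hello!</div>'
after_sibling = '<div><b>Hi</b><fruit state="rotten">avocado</fruit> there</div>'

print(strip_rotten(first_child, keep_tail=False))  # plain remove(): " Hello!" is gone
print(strip_rotten(first_child))                   # <div> Hello!</div>
print(strip_rotten(after_sibling))                 # <div><b>Hi</b> there</div>
```

The second sample exercises the getprevious() branch: the tail " there" is appended to the tail of the preceding b element instead of to the parent's text.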
Messa, Dec 01 '18 at 16:33