Is there an easy way to manipulate XML documents in Python?

I did a little research on this but could not find anything useful. I need to not just parse and read XML documents, but actually manipulate them in Python, much as JavaScript can manipulate HTML documents.

Let me give you an example. I have the following XML document:

    <library>
        <book id="123">
            <title>Intro to XML</title>
            <author>John Smith</author>
            <year>1996</year>
        </book>
        <book id="456">
            <title>XML 101</title>
            <author>Bill Jones</author>
            <year>2000</year>
        </book>
        <book id="789">
            <title>This Book is Unrelated to XML</title>
            <author>Justin Tyme</author>
            <year>2006</year>
        </book>
    </library>

I need a way to extract an element, either with XPath or in a more Pythonic way (as described here), and I also need to be able to modify the document, for example:

    >>> xml.getElement('id=123').title = "Intro to XML v2"
    >>> xml.getElement('id=123').year = "1998"

If anyone knows about such a tool in Python, please let me know. Thanks!

2 answers

lxml lets you select elements with XPath and then modify the elements you selected.

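    import lxml.etree as et

    xmltext = """
    <root>
        <fruit>apple</fruit>
        <fruit>pear</fruit>
        <fruit>mango</fruit>
        <fruit>kiwi</fruit>
    </root>
    """

    # Parse the string into an element tree rooted at <root>.
    tree = et.fromstring(xmltext)

    # Select every <fruit> element with XPath and rewrite its text in place.
    for fruit in tree.xpath('//fruit'):
        fruit.text = 'rotten %s' % (fruit.text,)

    # encoding='unicode' returns str rather than bytes under Python 3.
    print(et.tostring(tree, pretty_print=True, encoding='unicode'))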

Result:

    <root>
        <fruit>rotten apple</fruit>
        <fruit>rotten pear</fruit>
        <fruit>rotten mango</fruit>
        <fruit>rotten kiwi</fruit>
    </root>

If you want to avoid installing lxml, you can use xml.etree from the standard library.

Here is Acorn's answer ported to xml.etree:

    import xml.etree.ElementTree as et  # was: import lxml.etree as et

    xmltext = """
    <root>
        <fruit>apple</fruit>
        <fruit>pear</fruit>
        <fruit>mango</fruit>
        <fruit>kiwi</fruit>
    </root>
    """

    tree = et.fromstring(xmltext)

    for fruit in tree.findall('fruit'):  # was: tree.xpath('//fruit')
        fruit.text = 'rotten %s' % (fruit.text,)

    # removed argument: pretty_print
    # encoding='unicode' returns str rather than bytes under Python 3.
    print(et.tostring(tree, encoding='unicode'))
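
One detail of the port is worth spelling out: in xml.etree, findall('fruit') matches only direct children of the element it is called on, while '//fruit' in the lxml version matches at any depth. The sketch below uses a hypothetical nested document (the <basket> element is an assumption, not part of the original example) to show the difference, along with the './/fruit' form that searches all descendants.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one <fruit> is nested inside a <basket>.
xmltext = """
<root>
    <fruit>apple</fruit>
    <basket>
        <fruit>pear</fruit>
    </basket>
</root>
"""

root = ET.fromstring(xmltext)

# 'fruit' matches only direct children of root.
direct = [f.text for f in root.findall('fruit')]

# './/fruit' matches descendants at any depth, like '//fruit' in the lxml version.
nested = [f.text for f in root.findall('.//fruit')]

print(direct)
print(nested)
```

For the flat document in the answer both forms select the same four elements, so the port behaves identically there.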

Note: I would have posted this as a comment on Acorn's answer if I could have formatted it clearly there. If you like this answer, please upvote Acorn's as well.
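
Applied to the library document from the question, the same approach might look like the sketch below. It assumes the id values are quoted so the input is well-formed XML, and it uses the limited XPath subset that xml.etree supports (the [@id='123'] attribute predicate) in place of the hypothetical getElement call from the question.

```python
import xml.etree.ElementTree as ET

# The question's document, with attribute values quoted so it parses.
xmltext = """
<library>
    <book id="123">
        <title>Intro to XML</title>
        <author>John Smith</author>
        <year>1996</year>
    </book>
    <book id="456">
        <title>XML 101</title>
        <author>Bill Jones</author>
        <year>2000</year>
    </book>
    <book id="789">
        <title>This Book is Unrelated to XML</title>
        <author>Justin Tyme</author>
        <year>2006</year>
    </book>
</library>
"""

library = ET.fromstring(xmltext)

# Select the book by its id attribute; find() returns None if nothing matches.
book = library.find(".//book[@id='123']")
if book is not None:
    # Child elements are modified in place through their .text attribute.
    book.find('title').text = 'Intro to XML v2'
    book.find('year').text = '1998'

# encoding='unicode' makes tostring() return str instead of bytes.
result = ET.tostring(library, encoding='unicode')
print(result)
```

With lxml the same find() calls should work unchanged, and library.xpath("//book[@id='123']") is available when a full XPath expression is needed.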


Source: https://habr.com/ru/post/1379137/