XPath: get only elements with a specific subitem

Question

I have data in an XML document in the following format:

<xml xmlns="namespace1" xmlns:ns2="namespace2">
  <entry>
    <id>123</id>
    <ns2:content name="type">directory</ns2:content>
    <ns2:content name="numErrors">3</ns2:content>
  </entry>
  ...
  <entry>
    <id>456</id>
    <ns2:content name="type">file</ns2:content>
    <ns2:content name="docState">success</ns2:content>
  </entry>
  ...
</xml>

Using Python lxml, I need to get only those entry objects that represent directories. All entries contain an <ns2:content name="type"> element, but I need a list of the entry objects where the text of that element is directory. I can do this in a couple of awkward steps, but I would prefer a single query. Here is how I do it step by step:

 # xml_parse.py
 ns = {'ns1': 'namespace1', 'ns2': 'namespace2'}
 for node in tree.xpath("//ns1:entry", namespaces=ns):
     # find() also needs the prefix map to resolve the ns2: prefix
     if node.find("ns2:content[@name='type']", namespaces=ns).text == "directory":
         # do stuff with node
         pass

Can someone explain how to do this filtering in the XPath of the for statement, instead of using an if?

Thanks.

python xml xpath lxml

ewok Nov 10 '11 at 20:16

1 answer

Wayne Burkett · Accepted Answer · 2011-11-10T21:14:17+0000

Use the following XPath expression:

 //ns1:entry[ns2:content[@name='type' and .='directory']]
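As a rough sketch of how this expression could be dropped into the loop from the question: the sample document below is modelled on the one in the question with the "..." placeholders left out, and the names XML, directories, and directory_ids are made up for illustration. The inner predicate keeps only content elements whose name attribute is type and whose text is directory; the outer predicate keeps only entry elements that have such a child.

```python
from lxml import etree

# Sample document modelled on the question (the "..." placeholders are omitted).
XML = b"""<xml xmlns="namespace1" xmlns:ns2="namespace2">
  <entry>
    <id>123</id>
    <ns2:content name="type">directory</ns2:content>
    <ns2:content name="numErrors">3</ns2:content>
  </entry>
  <entry>
    <id>456</id>
    <ns2:content name="type">file</ns2:content>
    <ns2:content name="docState">success</ns2:content>
  </entry>
</xml>"""

# The default namespace has no prefix in the document, so it is bound
# to the arbitrary prefix "ns1" here, as in the question.
ns = {"ns1": "namespace1", "ns2": "namespace2"}
tree = etree.fromstring(XML)

# The predicate does the filtering, so no if statement is needed in the loop.
directories = tree.xpath(
    "//ns1:entry[ns2:content[@name='type' and .='directory']]",
    namespaces=ns,
)

# Collect the <id> text of each matching entry (hypothetical "do stuff" step).
directory_ids = [node.findtext("ns1:id", namespaces=ns) for node in directories]
print(directory_ids)
```

With the sample data above, only the entry with id 123 should be selected, since the other entry's type is file.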
