Python lxml iterfind with a namespace but prefix=None

I want to run iterfind() on elements that are in a namespace but have no prefix. I would like to call

iterfind([tagname]) or iterfind([tagname], [namespace dict])

rather than having to write the tag out like this:

"{%s}tagname" % tree.nsmap[None]

More details

I am using the XML response from the Google API. The root node defines several namespaces, including one for which there is no prefix: xmlns="http://www.w3.org/2005/Atom"

When I search my etree, everything behaves as I would expect for prefixed elements, e.g.:

    >>> for x in root.iterfind('dxp:segment'): print x
    ...
    <Element {http://schemas.google.com/analytics/2009}segment at 0x1211b98>
    <Element {http://schemas.google.com/analytics/2009}segment at 0x1211d78>
    <Element {http://schemas.google.com/analytics/2009}segment at 0x1211a08>
    >>>

But when I search for an element without a prefix, the search does not automatically apply the namespace from root.nsmap[None], e.g.:

    >>> for x in root.iterfind('entry'): print x
    ...
    >>>

Even if I pass the namespace map as the optional argument to iterfind, it does not apply the namespace.

1 answer

Try the following:

    for x in root.iterfind('{http://www.w3.org/2005/Atom}entry'):
        print(x)

For more information, see the docs: http://lxml.de/tutorial.html#namespaces

If you do not want to type the URI out each time and would rather supply a namespace map, you always need to use a prefix, for example:

    nsmap = {'atom': 'http://www.w3.org/2005/Atom'}
    for x in root.iterfind('atom:entry', namespaces=nsmap):
        print(x)

(The same applies if you want to use xpath.)

Which prefix the document uses, if any, does not matter. What matters is that you give the fully qualified name of the element, either by writing out the URI in curly-brace notation or by using a prefix that you have mapped to that URI.
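To make the point above concrete, here is a minimal sketch that runs all three lookups against a small sample feed. The feed content, the names FEED and ATOM_NS, and the prefix "a" are made up for illustration; they are not from the Google API response. The fallback import is only an assumption so that the sketch also runs where lxml is not installed, since the standard library's ElementTree accepts the same path syntax for these calls.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib, which supports the same calls used here.
    import xml.etree.ElementTree as etree

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical sample document: Atom is the default (unprefixed) namespace.
FEED = b"""<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <dxp:segment id="s1"/>
  <entry><title>first</title></entry>
  <entry><title>second</title></entry>
</feed>"""

root = etree.fromstring(FEED)

# A bare tag name only matches elements that are in no namespace at all.
bare = list(root.iterfind("entry"))

# Curly-brace notation: the namespace URI is written into the tag name.
clark = list(root.iterfind("{%s}entry" % ATOM_NS))

# Prefix notation: "a" is an arbitrary prefix chosen here; the document
# itself never uses it. Only the URI it maps to has to match.
nsmap = {"a": ATOM_NS}
prefixed = list(root.iterfind("a:entry", namespaces=nsmap))
titles = [e.findtext("a:title", namespaces=nsmap) for e in prefixed]

print(len(bare), len(clark), titles)
```

The bare search finds nothing, while the curly-brace and prefixed searches find the same two entries, even though the prefix "a" appears nowhere in the document.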
