How to parse and write XML using Python ElementTree without moving namespaces?

Our project receives XML from an upstream process in the following form:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
  <appSettings>
    <add key="foo" value="default">
    ...
  </appSettings>
</configuration>

It then reads and parses this XML using ElementTree, and for each app setting that matches a specific key ("foo") it writes a new value that it knows about but the upstream process does not (in this case, the key "foo" must have the value "bar").
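A minimal sketch of the step described above, assuming the settings are `<add>` elements under `<appSettings>` as in the sample. The `set_app_setting` helper and the second "other" setting are hypothetical and only there for illustration:

```python
from xml.etree import ElementTree as ET

# Trimmed-down sample; the "other" entry is a made-up second setting.
SAMPLE = """<configuration>
  <appSettings>
    <add key="foo" value="default" />
    <add key="other" value="untouched" />
  </appSettings>
</configuration>"""


def set_app_setting(root, key, value):
    """Set value= on every <add> under <appSettings> whose key= matches."""
    changed = 0
    for add in root.findall("./appSettings/add"):
        if add.get("key") == key:
            add.set("value", value)
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
count = set_app_setting(root, "foo", "bar")
print(count, root.find("./appSettings/add[@key='foo']").get("value"))
```

The edit itself is unproblematic; the difficulty described below only appears when the tree is written back out.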

The downstream process that consumes the filtered XML is, uh, fragile. It expects to receive XML in exactly the form above.

If I parse this XML without registering a namespace, ElementTree rewrites my tree so that it looks like this on output:

<configuration xmlns:ns0="urn:schemas-microsoft-com:asm.v1">
  <runtime>
  <ns0:assemblyBinding>
    <ns0:dependentAssembly>
      <ns0:assemblyIdentity culture="neutral" name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" />
      <ns0:bindingRedirect newVersion="7.0.0.0" oldVersion="0.0.0.0-6.0.0.0" />
    </ns0:dependentAssembly>
  </ns0:assemblyBinding>
 </runtime>
 <appSettings>
    <add key="foo" value="default">
    ...
 </appSettings>
</configuration>
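The behaviour can be reproduced with a few lines of standard-library code. This is a sketch on a shortened version of the sample; the exact generated prefix (typically `ns0`) depends on interpreter state, so the comments avoid promising a specific one:

```python
from xml.etree import ElementTree as ET

SAMPLE = """<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly />
    </assemblyBinding>
  </runtime>
</configuration>"""

root = ET.fromstring(SAMPLE)

# The parser folds the namespace into the tag name as '{uri}local';
# the original position of the xmlns attribute is not remembered.
binding = root.find("./runtime/*")
print(binding.tag)

# On output the serializer declares every namespace on the root element.
out = ET.tostring(root, encoding="unicode")
print(out)
```

Since the tree only stores qualified names, the serializer has no record of which element originally carried the declaration.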

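The downstream process does not accept this. If I instead register the namespace as the default (empty prefix) before writing, the prefixes go away, but ElementTree moves the xmlns declaration to the root element: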

<configuration xmlns="urn:schemas-microsoft-com:asm.v1">
 <runtime>
  <assemblyBinding>
    <dependentAssembly>
      <assemblyIdentity culture="neutral" name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" />
      <bindingRedirect newVersion="7.0.0.0" oldVersion="0.0.0.0-6.0.0.0" />
    </dependentAssembly>
  </assemblyBinding>
 </runtime>
 <appSettings>
    <add key="foo" value="default">
    ...
 </appSettings>
</configuration>
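A short sketch of that second attempt, again on a shortened sample. It uses `ET.register_namespace` with an empty prefix and then reparses the result to show a side effect: with xmlns on the root, `<configuration>` itself ends up inside the namespace.

```python
from xml.etree import ElementTree as ET

NS = "urn:schemas-microsoft-com:asm.v1"
SAMPLE = f"""<configuration>
  <runtime>
    <assemblyBinding xmlns="{NS}" />
  </runtime>
</configuration>"""

root = ET.fromstring(SAMPLE)

# register_namespace changes module-wide state used by the serializer.
ET.register_namespace("", NS)
out = ET.tostring(root, encoding="unicode")
print(out)

# Reparsing shows the side effect: the root element is now namespaced too.
reparsed = ET.fromstring(out)
print(root.tag, "->", reparsed.tag)
```

So the hoisted form is not just cosmetically different; a namespace-aware consumer would see different element names.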

Is there a way to write the XML back out so that the xmlns declaration stays on <assemblyBinding> instead of moving to <configuration>?

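In other words, is there a way, using ElementTree, to read the XML, change the value of foo, and write it back out otherwise unchanged?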

  • Switching to lxml is not an option: lxml depends on C libraries, and this project has to stay pure Python.

  • HTML, , , , ; Python, , , .

  • . .

As far as I can tell, ElementTree always writes the xmlns declarations on the root node.

, , xmlns " node", .

- ​​?


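As far as I know, the option that fits best is to write a custom renderer in pure Python on top of xml.etree.ElementTree. Here is one possible implementation: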

from re import findall, sub
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape

def render(root, buffer='', namespaces=None, level=0, indent_size=2, encoding='utf-8'):
    # Emit the XML declaration once, at the top level only.
    buffer += f'<?xml version="1.0" encoding="{encoding}" ?>\n' if not level else ''
    element = root.getroot() if isinstance(root, ET.ElementTree) else root
    # ET._namespaces is a private helper returning (qnames, {uri: prefix}).
    # The dict is shared by all recursive calls so each URI is declared once.
    _, namespaces = ET._namespaces(element) if not level else (None, namespaces)
    indent = ' ' * indent_size * level
    # Strip the '{uri}' part that ElementTree prepends to qualified tags.
    tag = sub(r'(\{[^}]+\}\s*)*', '', element.tag)
    buffer += f'{indent}<{tag}'
    # Declare a namespace on the first element that uses it, not on the root.
    # Tags are written without a prefix, so this only suits a default namespace.
    for ns in findall(r'\{[^}]+\}', element.tag):
        ns_key = ns[1:-1]
        if ns_key not in namespaces:
            continue
        prefix = namespaces[ns_key]
        buffer += ' xmlns' + (f':{prefix}' if prefix else '') + f'="{ns_key}"'
        del namespaces[ns_key]
    for k, v in element.attrib.items():
        value = escape(v, {'"': '&quot;'})
        buffer += f' {k}="{value}"'
    buffer += '>' + (escape(element.text.strip()) if element.text else '')
    # Recurse into direct children only; each call renders exactly one element.
    children = list(element)
    for child in children:
        sep = '\n' if buffer[-1] != '\n' else ''
        buffer += sep + render(child, level=level+1, indent_size=indent_size, namespaces=namespaces)
    buffer += f'{indent}</{tag}>\n' if children else f'</{tag}>\n'
    return buffer

Given the XML from the question, render can be used as follows:

data=\
'''<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
  <appSettings>
    <add key="foo" value="default" />
  </appSettings>
</configuration>'''

e = ET.fromstring(data)
# Register the empty prefix so render() writes a bare xmlns="..." attribute.
ET.register_namespace('', "urn:schemas-microsoft-com:asm.v1")
r = ET.ElementTree(e)
print(render(r))

Rendering r with render produces the following XML, with the xmlns declaration left on <assemblyBinding>:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral"></assemblyIdentity>
        <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0"></bindingRedirect>
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
  <appSettings>
    <add key="foo" value="default"></add>
  </appSettings>
</configuration>
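Because the downstream consumer is sensitive to where xmlns appears, it can be useful to check that a rendered document still parses to the same qualified names as the input. The following is a sketch of such a check; `same_tree` is a hypothetical helper, and the three documents are shortened stand-ins for the input, the renderer's output shape, and the hoisted form:

```python
from xml.etree import ElementTree as ET


def same_tree(a, b):
    """Compare elements by qualified tag, attributes, stripped text, children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


NS = "urn:schemas-microsoft-com:asm.v1"
original = (
    f'<configuration><runtime><assemblyBinding xmlns="{NS}">'
    "</assemblyBinding></runtime></configuration>"
)
# Same shape as the renderer's output: xmlns stays on <assemblyBinding>.
kept = (
    "<configuration>\n  <runtime>\n"
    f'    <assemblyBinding xmlns="{NS}"></assemblyBinding>\n'
    "  </runtime>\n</configuration>"
)
# xmlns hoisted to the root, as happens with register_namespace('', ...).
hoisted = (
    f'<configuration xmlns="{NS}"><runtime><assemblyBinding />'
    "</runtime></configuration>"
)

print(same_tree(ET.fromstring(original), ET.fromstring(kept)))     # True
print(same_tree(ET.fromstring(original), ET.fromstring(hoisted)))  # False
```

This ignores whitespace-only text and element tails, which is enough to tell a relocated namespace declaration apart from a purely cosmetic difference.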

, .. , , , , . !


Source: https://habr.com/ru/post/1684857/

