How should I handle line breaks in strings that I want to marshal to XML in Java?

I'm having difficulty using Java and JAXB to handle strings that contain line breaks when writing XML files. The data is retrieved from a database with the actual line feed characters in it.

Foo <LF> bar 

Or another example:

 Foo\r\n\r\nBar 

Yields:

 Foo&#xD; &#xD; Bar 

If I simply marshal this data to XML, I get the literal line break characters in the output. This is apparently against the XML standards, where these characters should be encoded as &#xD;. That is, in the XML output file I should see:

Foo &#xD;bar

But if I try to encode it manually, my ampersand ends up getting encoded!

Foo &amp;#xD;bar

This is pretty ironic, because the process that apparently should be encoding the line breaks in the first place, and does not, is the one foiling my attempts to encode them manually.
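The double encoding described above can be illustrated with a minimal sketch. The `escapeText` method below is a hypothetical stand-in for the text escaping a marshaller performs; it is not JAXB's actual code. It shows that a raw carriage return can be turned into `&#xD;`, while a string that already contains `&#xD;` has its ampersand escaped again.

```java
public class EscapeSketch {

    // Hypothetical stand-in for a marshaller's text escaping (not JAXB's real implementation).
    public static String escapeText(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '\r': out.append("&#xD;"); break; // carriage return as a character reference
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw carriage return in the data: escaped once, as desired.
        System.out.println(escapeText("Foo \rbar"));     // prints: Foo &#xD;bar
        // Manually pre-encoded data: the ampersand is escaped a second time.
        System.out.println(escapeText("Foo &#xD;bar"));  // prints: Foo &amp;#xD;bar
    }
}
```

This is why the data should be handed to the marshaller with the raw characters in it, leaving the escaping to the layer that writes the XML.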

1 answer

The following is an example of the default JAXB behavior with respect to \n and \r:

Java Model (Root)

    import javax.xml.bind.annotation.XmlRootElement;

    @XmlRootElement
    public class Root {

        private String foo;
        private String bar;

        public String getFoo() {
            return foo;
        }

        public void setFoo(String foo) {
            this.foo = foo;
        }

        public String getBar() {
            return bar;
        }

        public void setBar(String bar) {
            this.bar = bar;
        }
    }

Demo code

    import javax.xml.bind.*;

    public class Demo {

        public static void main(String[] args) throws Exception {
            JAXBContext jc = JAXBContext.newInstance(Root.class);

            Root root = new Root();
            root.setFoo("Hello\rWorld"); // carriage return
            root.setBar("Hello\nWorld"); // line feed

            Marshaller marshaller = jc.createMarshaller();
            marshaller.marshal(root, System.out);
        }
    }

Output

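    <?xml version="1.0" encoding="UTF-8" standalone="yes"?><root><bar>Hello
    World</bar><foo>Hello&#xD;World</foo></root>

The \n is written as a literal line break, while the \r is written as the character reference &#xD;.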
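The reason \r needs escaping while \n does not is that an XML parser normalizes literal carriage returns to line feeds on input, whereas a character reference such as &#xD; survives parsing. The following is a small sketch using only the JDK's DOM parser; the class name `LineEndRoundTrip` and the helper `parseText` are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class LineEndRoundTrip {

    // Parses a small XML document and returns the text content of its root element.
    public static String parseText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement()
                      .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // A literal carriage return is normalized to a line feed by the parser.
        String literal = parseText("<foo>Hello\rWorld</foo>");
        // A carriage return written as a character reference is preserved.
        String escaped = parseText("<foo>Hello&#xD;World</foo>");

        System.out.println(literal.equals("Hello\nWorld")); // prints: true
        System.out.println(escaped.equals("Hello\rWorld")); // prints: true
    }
}
```

So if the original \r characters must survive a round trip, they have to be written as &#xD;; a literal \r in the file will come back as \n.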

UPDATE

Below is some additional information based on research that I have done.

Common to All JAXB (JSR-222) Implementations

  • If you marshal to an XMLStreamWriter or XMLEventWriter, either directly (through the Marshaller ) or indirectly (potentially through a JAX-RS or JAX-WS provider), then the escaping will be based on the StAX implementation. Woodstox appears to escape things correctly, but the StAX implementation in the JDK that I am using did not.
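Since the behavior depends on which StAX implementation is on the classpath, it may be worth probing it directly. The following is a minimal sketch that writes a string containing \r through whatever XMLStreamWriter the default XMLOutputFactory provides and reports how it was written; the class name `StaxEscapeProbe` and the helper `writeElement` are made up for this illustration, and the result will vary by implementation and version.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class StaxEscapeProbe {

    // Writes <foo>text</foo> with the default StAX implementation and returns the raw output.
    public static String writeElement(String text) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("foo");
        writer.writeCharacters(text);
        writer.writeEndElement();
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = writeElement("Hello\rWorld");
        // Report whether this StAX implementation escaped the carriage return.
        System.out.println(xml.contains("&#xD;")
                ? "\\r was escaped as &#xD;"
                : "\\r was not escaped as &#xD;");
    }
}
```

Running the same probe with Woodstox on the classpath and without it is one way to compare the two implementations mentioned above.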

EclipseLink JAXB (MOXy)

JAXB Reference Implementation

  • The JAXB reference implementation will correctly escape '\r' when marshalling to an OutputStream , but not when marshalling to a Writer , at least in the JDK that I am using.

Source: https://habr.com/ru/post/951235/

