Reading the contents of an XML file without having to delete the XML declaration

I want to read all the XML content from a file. The code below only works if I delete the XML declaration ( <?xml version="1.0" encoding="UTF-8"?> ). What is the best way to read the file without removing the XML declaration?

    XmlTextReader reader = new XmlTextReader(@"c:\my path\a.xml");
    reader.Read();
    string rs = reader.ReadOuterXml();

Unless I delete the XML declaration, reader.ReadOuterXml() returns an empty string.

    <?xml version="1.0" encoding="UTF-8"?>
    <s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
      <s:Header>
        <a:Action s:mustUnderstand="1">http://www.as.com/ver/ver.IClaimver/Car</a:Action>
        <a:MessageID>urn:uuid:b22149b6-2e70-46aa-8b01-c2841c70c1c7</a:MessageID>
        <ActivityId CorrelationId="16b385f3-34bd-45ff-ad13-8652baeaeb8a" xmlns="http://schemas.microsoft.com/2004/09/ServiceModel/Diagnostics">04eb5b59-cd42-47c6-a946-d840a6cde42b</ActivityId>
        <a:ReplyTo>
          <a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address>
        </a:ReplyTo>
        <a:To s:mustUnderstand="1">http://localhost/ver.Web/ver2011.svc</a:To>
      </s:Header>
      <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <Car xmlns="http://www.as.com/ver">
          <carApplication>
            <HB_Base xsi:type="HB" xmlns="urn:core">
              <Header>
                <Advisor>
                  <AdvisorLocalAuthorityCode>11</AdvisorLocalAuthorityCode>
                  <AdvisorType>1</AdvisorType>
                </Advisor>
              </Header>
              <General>
                <ApplyForHB>yes</ApplyForHB>
                <ApplyForCTB>yes</ApplyForCTB>
                <ApplyForFSL>yes</ApplyForFSL>
                <ConsentSupplied>no</ConsentSupplied>
                <SupportingDocumentsSupplied>no</SupportingDocumentsSupplied>
              </General>
            </HB_Base>
          </carApplication>
        </Car>
      </s:Body>
    </s:Envelope>

Update

I know there are other approaches that do not use an XML reader (e.g. File.ReadAllText()), but I need one that uses the XML APIs.

+4
5 answers

There can be no text or whitespace before the <?xml ?> declaration other than a byte order mark, and no text between the declaration and the root element other than whitespace such as line breaks.

Anything else makes the document invalid.

UPDATE:

I think your expectation of XmlTextReader.Read() is wrong.

Each call to XmlTextReader.Read() advances to the next "token" in the XML document, one token at a time. A "token" here means an XML element, whitespace, text, or the XML declaration.

Your call to reader.ReadOuterXml() returns an empty string because the first token in your XML file is the XML declaration, and ReadOuterXml() returns an empty string unless the reader is positioned on an element or attribute.

Consider this code:

    XmlTextReader reader = new XmlTextReader("test.xml");
    reader.Read();
    Console.WriteLine(reader.NodeType); // XmlDeclaration
    reader.Read();
    Console.WriteLine(reader.NodeType); // Whitespace
    reader.Read();
    Console.WriteLine(reader.NodeType); // Element
    string rs = reader.ReadOuterXml();  // markup of the root element

The above code produces this output:

    XmlDeclaration
    Whitespace
    Element

The first token is an XML declaration.

The second token is the line break after the XML declaration.

The third token is the <s:Envelope> element. From here, calling reader.ReadOuterXml() returns what I think you expect to see: the text of the <s:Envelope> element, which is the entire SOAP envelope.
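Rather than counting Read() calls, the reader can be asked to skip ahead to the first content node with MoveToContent(), which passes over the XML declaration and whitespace. A minimal sketch, assuming a hypothetical helper named XmlSource.ReadRootOuterXml (not from the original answer) and using XmlReader.Create in place of XmlTextReader:

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlSource
{
    // Hypothetical helper: returns the markup of the root element,
    // skipping the XML declaration and any whitespace before it.
    public static string ReadRootOuterXml(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            // MoveToContent() advances past non-content nodes such as
            // XmlDeclaration and Whitespace and stops on the first
            // content node, here the root element.
            reader.MoveToContent();
            return reader.ReadOuterXml();
        }
    }
}
```

For the file in the question, this could be called as XmlSource.ReadRootOuterXml(File.OpenText(@"c:\my path\a.xml")); the result should be the <s:Envelope> element without the declaration.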

If you really want to load the XML file into memory as objects, just call var doc = XDocument.Load("test.xml") and do it in one go.

Unless you are working with an XML document so huge that it will not fit into system memory, there is really little reason to step through an XML document one token at a time.
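The XDocument.Load approach can be sketched as follows. One thing to be aware of is that XDocument.ToString() leaves out the XML declaration, so this sketch prepends it when the source had one. The helper name XmlText.LoadWithDeclaration is hypothetical, and the output is re-serialized rather than a byte-for-byte copy of the file:

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class XmlText
{
    // Hypothetical helper: loads the whole document into memory and
    // returns its text with the XML declaration included.
    public static string LoadWithDeclaration(TextReader input)
    {
        XDocument doc = XDocument.Load(input);

        // ToString() on an XDocument omits the declaration.
        string body = doc.ToString(SaveOptions.DisableFormatting);

        // Declaration is null when the source document had none.
        return doc.Declaration == null
            ? body
            : doc.Declaration + Environment.NewLine + body;
    }
}
```

If the declaration is not needed, doc.ToString() on its own is enough.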

+6

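What about:

    XmlDocument doc = new XmlDocument();
    doc.Load(@"c:\my path\a.xml");
    // Now we have the XML document; convert it to a string.
    // There are many ways to do this; one is:
    StringWriter sw = new StringWriter();
    // Note: Save(TextWriter) writes the writer's encoding (UTF-16 for
    // StringWriter) into the declaration in place of the original one.
    doc.Save(sw);
    string finalresult = sw.ToString();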
+2

EDIT: I assume you mean that you have text between the XML declaration and the root element. If not, please clarify.

Without removing the additional text, this is simply an invalid XML file, and I would not expect it to work. You do not have an XML file; you have something like an XML file, but with extraneous material in front of the root element.

+1

IMHO, you cannot read this file. This is because there is plain text before the <s:Envelope> root element, which invalidates the entire document.

+1

Are you parsing the XML document as XML just to get its source text? Why?

If you really want to do this, then:

    string rs;
    using (var rdr = new StreamReader(@"c:\my path\a.xml"))
        rs = rdr.ReadToEnd();

It will work, but I'm not sure this is what you really want. It pretty much ignores the fact that this is XML and just reads the text. That is useful for some things, but not many.

0

Source: https://habr.com/ru/post/1386757/

