Is there a C++ function that replaces XML special characters with their escape sequences?

I have searched the Internet extensively and have not found a C++ function that replaces XML special characters with their escape sequences. Does something like this exist?

I know about the following:

    Special character   Escape sequence   Purpose
    &                   &amp;             Ampersand sign
    '                   &apos;            Single quote
    "                   &quot;            Double quote
    >                   &gt;              Greater than
    <                   &lt;              Less than

Are there any more? What about a hex value such as 0x00 in the input; is that also a problem?

+5
c++ xml escaping special-characters
Mar 28
6 answers

As already mentioned, you could write your own. For example:

    #include <iostream>
    #include <map>
    #include <string>

    int main()
    {
        std::string xml("a < > & ' \" string");
        std::cout << xml << "\n";

        // Characters to be transformed.
        std::map<char, std::string> transformations;
        transformations['&']  = std::string("&amp;");
        transformations['\''] = std::string("&apos;");
        transformations['"']  = std::string("&quot;");
        transformations['>']  = std::string("&gt;");
        transformations['<']  = std::string("&lt;");

        // Build list of characters to be searched for.
        std::string reserved_chars;
        for (auto ti = transformations.begin(); ti != transformations.end(); ++ti)
        {
            reserved_chars += ti->first;
        }

        size_t pos = 0;
        while (std::string::npos != (pos = xml.find_first_of(reserved_chars, pos)))
        {
            const std::string& replacement = transformations[xml[pos]];
            xml.replace(pos, 1, replacement);
            // Skip past the inserted text so it is never rescanned.
            pos += replacement.length();
        }

        std::cout << xml << "\n";
        return 0;
    }

Output:

    a < > & ' " string
    a &lt; &gt; &amp; &apos; &quot; string

To support another character, add an entry for it to transformations.

+6
Mar 28 '12 at 9:20

Writing your own is simple enough, but scanning a string several times to find and replace individual characters can be inefficient. A single pass is enough:

    #include <sstream>
    #include <string>

    std::string escape(const std::string& src)
    {
        std::stringstream dst;
        for (char ch : src)
        {
            switch (ch)
            {
                case '&':  dst << "&amp;";  break;
                case '\'': dst << "&apos;"; break;
                case '"':  dst << "&quot;"; break;
                case '<':  dst << "&lt;";   break;
                case '>':  dst << "&gt;";   break;
                default:   dst << ch;       break;
            }
        }
        return dst.str();
    }

Note: for convenience I used the C++11 range-based for loop, but you can easily do the same with an iterator.
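
For readers without C++11, the iterator variant mentioned in the note might look like the sketch below. The name escape_with_iterators is made up for this illustration, and appending to a std::string instead of a std::stringstream is a swapped-in choice, not part of the answer above.

```cpp
#include <string>

// Hypothetical name; applies the same five replacements as escape() above.
std::string escape_with_iterators(const std::string& src)
{
    std::string dst;
    dst.reserve(src.size());  // lower bound only; dst grows if anything is escaped
    for (std::string::const_iterator it = src.begin(); it != src.end(); ++it)
    {
        switch (*it)
        {
            case '&':  dst += "&amp;";  break;
            case '\'': dst += "&apos;"; break;
            case '"':  dst += "&quot;"; break;
            case '<':  dst += "&lt;";   break;
            case '>':  dst += "&gt;";   break;
            default:   dst += *it;      break;
        }
    }
    return dst;
}
```

The input is still scanned exactly once, so the single-pass property of the original is preserved.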

+10
Mar 28

These types of functions should be standard, and we should not have to rewrite them. If you are using VS, look at atlenc.h. This file is part of the VS installation. Inside it is the EscapeXML function, which is much more complete than any of the examples above.

+5
Feb 12 '14 at 18:51

There is a function, I just wrote it:

    #include <string>

    void replace_all(std::string& str, const std::string& old, const std::string& repl)
    {
        size_t pos = 0;
        while ((pos = str.find(old, pos)) != std::string::npos)
        {
            str.replace(pos, old.length(), repl);
            pos += repl.length();
        }
    }

    std::string escape_xml(std::string str)
    {
        // '&' must be handled first; otherwise the '&' of the entities
        // inserted by the later calls would itself be escaped.
        replace_all(str, "&", "&amp;");
        replace_all(str, "'", "&apos;");
        replace_all(str, "\"", "&quot;");
        replace_all(str, ">", "&gt;");
        replace_all(str, "<", "&lt;");
        return str;
    }

+2
Mar 28 '12 at 9:13

I slightly modified Ferruccio's solution so that it also removes other troublesome characters, such as control characters below 0x20 (that part was found elsewhere on the Internet). Tested and working.

    #include <map>
    #include <string>
    #include <boost/regex.hpp>

    using std::string;

    // Removes tags in place, then escapes the remaining XML special characters.
    void strip_tags(string* s)
    {
        // [^>]* keeps each match inside one tag; a greedy (.*) would delete
        // everything between the first '<' and the last '>'.
        boost::regex kj("</?[^>]*>");
        *s = boost::regex_replace(*s, kj, "", boost::format_all);

        std::map<char, std::string> transformations;
        transformations['&']  = std::string("&amp;");
        transformations['\''] = std::string("&apos;");
        transformations['"']  = std::string("&quot;");
        transformations['>']  = std::string("&gt;");
        transformations['<']  = std::string("&lt;");

        // Build list of characters to be searched for.
        std::string reserved_chars;
        for (std::map<char, std::string>::iterator ti = transformations.begin();
             ti != transformations.end(); ++ti)
        {
            reserved_chars += ti->first;
        }

        size_t pos = 0;
        while (std::string::npos != (pos = s->find_first_of(reserved_chars, pos)))
        {
            const std::string& replacement = transformations[(*s)[pos]];
            s->replace(pos, 1, replacement);
            pos += replacement.length();  // skip past the inserted text
        }
    }

    string removeTroublesomeCharacters(const string& inString)
    {
        string newString;
        for (size_t i = 0; i < inString.length(); i++)
        {
            // Compare as unsigned so that bytes >= 0x80 are not seen as negative.
            unsigned char ch = static_cast<unsigned char>(inString[i]);
            // Keep bytes 0x20..0xFC plus tab, newline and carriage return;
            // all other control characters are dropped.
            if ((ch < 0xFD && ch > 0x1F) || ch == '\t' || ch == '\n' || ch == '\r')
            {
                newString.push_back(inString[i]);
            }
        }
        return newString;
    }

So, in this case there are two functions. We can get the result with something like:

    string StartingString("Some_value");
    strip_tags(&StartingString);  // modifies StartingString in place; returns void
    string FinalString = removeTroublesomeCharacters(StartingString);

Hope this helps!

(Oh yes: credit for the second function goes to the author of the answer to this question: How to remove invalid hexadecimal characters from an XML-based data source before creating an XmlReader or XPathDocument that uses the data?)

+1
May 11 '12 at 14:18

It seems that you want to generate XML yourself. I think you need to be much clearer, and read the XML specification, if you want to succeed. Those five are the only special characters in XML. You say "I know that there are more special characters, foreign languages and currency signs"... these are not defined in XML unless you mean encoding them as code points (&#163; for example). Are you thinking of HTML or some other DTD?

The only way to avoid double encoding is to encode things only once. If you are handed the string "&gt;", how do you know whether it is already encoded and stands for ">", or whether the literal text "&gt;" is what should be represented?
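
To make the double-encoding problem concrete, here is a minimal sketch. The helper escape_once is hypothetical, written only for this illustration, and it handles just the five predefined entities.

```cpp
#include <string>

// Hypothetical helper: escapes the five predefined XML entities in one pass.
std::string escape_once(const std::string& src)
{
    std::string dst;
    for (std::string::size_type i = 0; i < src.size(); ++i)
    {
        switch (src[i])
        {
            case '&':  dst += "&amp;";  break;
            case '\'': dst += "&apos;"; break;
            case '"':  dst += "&quot;"; break;
            case '<':  dst += "&lt;";   break;
            case '>':  dst += "&gt;";   break;
            default:   dst += src[i];   break;
        }
    }
    return dst;
}
```

Calling escape_once(">") gives "&gt;", but feeding that result through escape_once again gives "&amp;gt;", which a parser decodes to the text "&gt;" rather than ">". The function cannot tell the two cases apart, so the caller has to guarantee that each string is escaped exactly once.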

The best way is to represent your XML as a DOM (with strings stored unencoded) and use an XML serializer such as Xerces.

Oh, and remember that there is no way to represent characters below 0x20 in XML, except &#x9;, &#xA; and &#xD;, which are whitespace.
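
Since those control characters cannot be escaped, one option is to drop them before serializing. The sketch below assumes XML 1.0 and byte-oriented input (for example UTF-8); both function names are made up for this illustration, and it does not validate anything above 0x1F.

```cpp
#include <string>

// Hypothetical helper: true for control bytes that XML 1.0 does not allow.
bool is_forbidden_control(unsigned char ch)
{
    return ch < 0x20 && ch != 0x09 && ch != 0x0A && ch != 0x0D;
}

// Hypothetical helper: returns a copy of src without the forbidden control bytes.
std::string strip_forbidden_controls(const std::string& src)
{
    std::string dst;
    dst.reserve(src.size());
    for (std::string::size_type i = 0; i < src.size(); ++i)
    {
        unsigned char ch = static_cast<unsigned char>(src[i]);
        if (!is_forbidden_control(ch))
        {
            dst += src[i];
        }
    }
    return dst;
}
```

Tab, line feed and carriage return pass through unchanged; a 0x00 byte, as asked about in the question, is removed. Whether silently dropping such bytes is acceptable, or whether the input should be rejected instead, depends on the application.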

0
Mar 28 '12 at 9:13