I am working on an HTML-to-XML conversion with XSLT, and it has been pretty straightforward so far. But one small problem remains unresolved.
For example, a node in my original HTML looks like this:
<p class="Arrow"><span class="char-style-override-12">4</span><span class="char-style-override-13"> </span>Sore, rash, growth, discharge, or swelling.</p>
As you can see, the first child node, <span>, has a value of 4. It actually displays as an arrow in the browser (maybe an encoding problem; my XML editor treats it as a numeric value).
Here is what I have so far: I wrote a template to match the tag, which passes the text content on to another template:
<xsl:template match="text()">
  <!-- Strip every "4" from the text node, then collapse whitespace -->
  <xsl:variable name="noNum">
    <xsl:value-of select="normalize-space(translate(., '4', ''))"/>
  </xsl:variable>
  <xsl:copy-of select="$noNum"/>
</xsl:template>
As you can see, this is certainly not a good solution: it removes every 4 that appears in the string, not just the first character. So I wonder whether there is a way to remove only the first character, and only if it is a number, possibly using a regular expression? Or am I going about this the wrong way, and is there a better way to solve the problem (for example, changing the encoding)?
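To make the goal concrete, this is roughly the behaviour I am after, written as an XSLT 1.0 sketch. I have not verified it against my real data, and I am not sure it is the right approach. It assumes the HTML elements are in no namespace, the output element name `item` is only a placeholder, and it drops the leading character of each text node only when that character is a digit:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- "item" is a placeholder output name, not part of my real schema -->
  <xsl:template match="p[@class='Arrow']">
    <item>
      <xsl:apply-templates select=".//text()"/>
    </item>
  </xsl:template>

  <xsl:template match="text()">
    <!-- First character of the current text node -->
    <xsl:variable name="first" select="substring(., 1, 1)"/>
    <xsl:choose>
      <!-- contains() is true for an empty string, so guard against that -->
      <xsl:when test="$first != '' and contains('0123456789', $first)">
        <!-- Leading digit: keep everything from the second character on -->
        <xsl:value-of select="normalize-space(substring(., 2))"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- No leading digit: keep the text, only collapsing whitespace -->
        <xsl:value-of select="normalize-space(.)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

For the <p class="Arrow"> example above, I would expect this to leave only "Sore, rash, growth, discharge, or swelling." inside the output element, while digits that appear later in a text node stay untouched. If an XSLT 2.0 processor is available, I believe the same test could be written with a regular expression as replace(., '^[0-9]', '').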
Any ideas are welcome! Thanks in advance!