Removing unwanted tags using XSL

The description element contains arbitrary content that I do not control, for example something like this:

    <description>
      <p>
        <span> <font>Hello</font> </span>
        World!
        <a href="/index">Home</a>
      </p>
    </description>

Any HTML tag can appear, and I do not need all of them. The tags I want to keep are p, i, em, strong, b, ol, ul, li and a. So, for example, <font> would be removed, but <p> and <a> would remain. I assume I need to match the ones I want (and make sure nothing matches the others), but I cannot work out how to do this.

Any help?

1 answer

Whitelist approach:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>

      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="*[not(self::description or self::p or self::i or self::em or self::strong or self::b or self::ol or self::ul or self::li or self::a)]"/>
    </xsl:stylesheet>

Note that this removes the unwanted elements and everything below them. To remove only the tag itself (for example font) while keeping its children, change the last template as follows:

    <xsl:template match="*[not(self::description or self::p or self::i or self::em or self::strong or self::b or self::ol or self::ul or self::li or self::a)]">
      <xsl:apply-templates/>
    </xsl:template>
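For reference, a complete stylesheet using the unwrapping form of that template might look like the sketch below. It is assembled only from the identity template and the whitelist predicate already shown; no new names are introduced.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element not on the whitelist: drop the tag (and its attributes),
       but keep processing its children. The predicate gives this template
       a higher default priority than the identity template. -->
  <xsl:template match="*[not(self::description or self::p or self::i
                             or self::em or self::strong or self::b
                             or self::ol or self::ul or self::li
                             or self::a)]">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample from the question, this should keep the text "Hello" while dropping the span and font tags around it, giving something equivalent to `<description><p>Hello World! <a href="/index">Home</a></p></description>`, whitespace aside.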

An equivalent (and slightly cleaner) solution:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>

      <xsl:template match="@*|node()" priority="-3">
        <xsl:copy/>
      </xsl:template>

      <xsl:template match="description|p|i|em|strong|b|ol|ul|li|a">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="*"/>
    </xsl:stylesheet>
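The stylesheet above drops non-whitelisted elements together with their content. If the children should survive here as well, one possible variant (a sketch, not part of the original answer) gives the catch-all template a body. The comments spell out how the template priorities are expected to resolve.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Lowest priority (-3): reached only by attributes, text, comments and
       processing instructions, because every element is matched by one of
       the two templates below. Copies the node as-is. -->
  <xsl:template match="@*|node()" priority="-3">
    <xsl:copy/>
  </xsl:template>

  <!-- Whitelisted elements (default priority 0): copy the element and
       continue with its attributes and children. -->
  <xsl:template match="description|p|i|em|strong|b|ol|ul|li|a">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Every other element (default priority -0.5): drop the tag,
       keep processing its children. -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Because a bare `xsl:apply-templates` selects child nodes only, the attributes of an unwrapped element are discarded along with its tag.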

The opposite approach is to blacklist unwanted elements:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>

      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="font|span"/>
    </xsl:stylesheet>

Again, add apply-templates to the final template if you want to keep the children of the removed elements.
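As a sketch of that last point, the blacklist stylesheet with the children preserved could look like this. The choice of font and span is taken from the example above; any other unwanted element names would have to be added to the match pattern.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Identity template: copies everything that is not matched below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Blacklisted elements: drop the tag, keep processing its children -->
  <xsl:template match="font|span">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

With unknown input, the blacklist is the weaker option: any tag that is not listed passes through untouched, so the whitelist fits the original requirement more closely.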


Source: https://habr.com/ru/post/1393459/

