XPath to get all text in an element as a single value, removing line breaks

I am trying to get all the text in a node for the following markup and return it as a single value (rather than as multiple nodes).

<p> "I love eating out." <br> <br> "This is my favorite restaurant." <br> "I will definitely be back" </p> 

Using '/p' returns all of the text, but with the line breaks included. Using "/p/text()" instead returns each run of text between tags as a separate value. The ideal result would be:

 "I love eating out. This is my favorite restaurant. I will definitely be back" 

I searched for similar questions but couldn't find anything close enough. Please note that in my current environment I am restricted to using an XPath query only; I cannot parse or preprocess the HTML beforehand. Specifically, I am using the importXML function in Google Docs.

1 answer

Use:

 normalize-space(/) 

When this XPath expression is evaluated, the string value of the document node (/) is produced first and is then passed as the argument to the standard XPath function normalize-space().

By definition, normalize-space() returns its argument with leading and trailing whitespace stripped and with every internal run of whitespace characters replaced by a single space.
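The two steps just described can be imitated in a few lines of Python, using only the standard library. This is a rough sketch rather than a real XPath evaluation: `string_value` and `normalize_space` are hypothetical helper names that mimic the XPath definitions, and the `<br>` tags are written as `<br />` so that the sample parses as XML.

```python
import re
import xml.etree.ElementTree as ET

# The sample from the question, with <br> closed so it is well-formed XML.
DOC = (
    '<p> "I love eating out." <br /> <br /> '
    '"This is my favorite restaurant." <br /> '
    '"I will definitely be back" </p>'
)

# XPath treats only space, tab, CR and LF as whitespace.
XPATH_WS = " \t\r\n"


def string_value(root):
    """Imitate the XPath string value: all descendant text, in document order."""
    return "".join(root.itertext())


def normalize_space(text):
    """Imitate normalize-space(): trim the ends, then collapse whitespace runs."""
    return re.sub(r"[ \t\r\n]+", " ", text.strip(XPATH_WS))


root = ET.fromstring(DOC)
result = normalize_space(string_value(root))
print(result)
```

The text pieces around the `<br />` elements are joined into one string before the whitespace is collapsed, which is why a single value comes back instead of one value per text node.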

Evaluating the above XPath expression produces:

 "I love eating out." "This is my favorite restaurant." "I will definitely be back"

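To remove the quotation marks, we additionally use the translate() function (the quote character is written literally here; inside an XML attribute, as in the XSLT further down, it must be escaped as &quot;):

 normalize-space(translate(/, '"', '')) 

Evaluating this expression produces: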

 I love eating out. This is my favorite restaurant. I will definitely be back 
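For readers unfamiliar with translate(), the following Python sketch imitates its behaviour: each character of the second argument is mapped to the character at the same position in the third, and characters with no counterpart are deleted. The names `xpath_translate` and `normalize_space` are hypothetical helpers, and `raw` is a hand-written stand-in for the string value of the sample document, so treat this as an illustration under those assumptions and not as output from an XPath engine.

```python
import re


def xpath_translate(text, src, dst):
    """Imitate XPath translate(): map src[i] to dst[i]; delete src characters
    that have no counterpart in dst."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) in table:
            continue  # the first occurrence in src wins, as in XPath
        table[ord(ch)] = dst[i] if i < len(dst) else None
    return text.translate(table)


def normalize_space(text):
    """Imitate normalize-space(): trim the ends, then collapse whitespace runs."""
    return re.sub(r"[ \t\r\n]+", " ", text.strip(" \t\r\n"))


# Stand-in for the string value of the sample document (assumed, typed by hand).
raw = ' "I love eating out."   "This is my favorite restaurant."  "I will definitely be back" '

# An empty third argument means every double quote is deleted.
unquoted = normalize_space(xpath_translate(raw, '"', ''))
print(unquoted)
```

Because the third argument is empty, translate() acts purely as a character filter here; normalize-space() then cleans up the whitespace that remains.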

Finally, to wrap this result in quotation marks, we use the concat() function:

 concat('"', normalize-space(translate(/, '"', '')), '"') 

Evaluating this XPath expression produces exactly the desired result:

 "I love eating out. This is my favorite restaurant. I will definitely be back" 

XSLT-based verification:

 <xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:template match="/">
   <xsl:value-of select=
    "concat('&quot;', normalize-space(translate(/,'&quot;', '')), '&quot;')"/>
  </xsl:template>
 </xsl:stylesheet>

When this transformation is applied to the provided XML document (corrected so that it is well-formed):

 <p> "I love eating out." <br /> <br /> "This is my favorite restaurant." <br /> "I will definitely be back" </p> 

the XPath expression is evaluated, and the result of this evaluation is copied to the output:

 "I love eating out. This is my favorite restaurant. I will definitely be back" 

Source: https://habr.com/ru/post/917925/

