How to extract text matching a pattern in XPath?

I have data that looks like this:

<value>v13772 @FBst0451145:w&lt;up&gt;1118&lt;/up&gt;; P{GD3649} v13772@ v13773 @FBst0451146:w&lt;up&gt;1118&lt;/up&gt;; P{GD3649} v13773@ </value> 

How can I process this string in XPath to extract any and all @FBst####### numbers?

I know about the XPath matches() function, but it only returns true or false, which is no good if I want the matching string. I searched but could not find a satisfactory answer to this problem, which is probably very common.

Thanks!

+6
4 answers

In addition to Michael Kay's good answer, if you want to use only the replace() function, use:

 replace(.,'.*?(@FBst\d+).*','$1') 

Result:

 @FBst0451145 @FBst0451146 

And if you only need the numbers from the above result, use:

 replace(replace(.,'.*?(@FBst\d+).*','$1'), '[^0-9]+', ' ') 

This produces:

  0451145 0451146 
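
To make the behaviour of the two replace() calls easier to follow, here is a rough Python sketch that mirrors them with the standard `re` module. It is not XPath, and it assumes the sample data has each entry on its own line (the variable names `value`, `ids` and `numbers` are made up for illustration). For this particular pattern, Python's regex dialect and the XPath 2.0 one should behave the same way.

```python
import re

# Assumed input: the sample data with each stock entry on its own line.
value = (
    "v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@\n"
    "v13773 @FBst0451146:w<up>1118</up>; P{GD3649} v13773@"
)

# Mirrors replace(., '.*?(@FBst\d+).*', '$1'): each line is matched in full
# and replaced by its captured ID, because "." does not cross the newline.
ids = re.sub(r".*?(@FBst\d+).*", r"\1", value)

# Mirrors the outer replace(..., '[^0-9]+', ' '): every run of non-digits,
# including the newline, collapses to a single space.
numbers = re.sub(r"[^0-9]+", " ", ids)

print(repr(ids))
print(repr(numbers))
```

The leading space in `numbers` comes from the `@FBst` prefix of the first ID being replaced by a space, which matches the output shown above.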
+7

Try

 tokenize(value, '[^0-9]+') 

which should return a sequence of tokens, splitting the string on every run of non-digit characters.
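
As a rough illustration of what that split yields, here is a Python analogue using `re.split` on the first entry of the sample data. This is only a sketch, not XPath. The empty strings that appear when the separator matches at the start or end of the input are filtered out here, and `value` and `tokens` are illustrative names.

```python
import re

# First entry of the sample data, with the XML entities already decoded.
value = "v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@"

# Rough analogue of tokenize(value, '[^0-9]+'): split on runs of non-digits.
# Empty tokens at the start and end of the input are dropped.
tokens = [t for t in re.split(r"[^0-9]+", value) if t]

print(tokens)
```

Note that every run of digits comes back, not only the ones that follow `@FBst`, so the result would still need filtering if only the FBst numbers are wanted.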

+2

If you can also use XQuery, the functx:get-matches() function from the FunctX module should work for you. Download the file that supports your version of XQuery, then import the module wherever you need its functionality.

 import module namespace functx = "http://www.functx.com" at "functx-1.0-doc-2007-01.xq";
 functx:get-matches(string-join(//text()),'xyz') 
+1

Building on Dimitre's answer, a regular expression that works:

 replace(.,'.*?(@FBst\d+).*','$1 ','m') 

This only works when each target entry is on its own line, but it will do for now.
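
The limitation can be seen in a small Python sketch that mirrors the replace() call with `re.sub`, assuming both entries sit on a single line. This is not XPath. The 'm' flag is not reproduced because the pattern has no `^` or `$` anchors for it to affect, and the names `one_line`, `replaced` and `found` are illustrative.

```python
import re

# Assumed input: the same data, but with both entries on a single line.
one_line = (
    "v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@ "
    "v13773 @FBst0451146:w<up>1118</up>; P{GD3649} v13773@"
)

# The trailing ".*" swallows the rest of the line, so only the first ID survives.
replaced = re.sub(r".*?(@FBst\d+).*", r"\1 ", one_line)

# Collecting the matches directly does not depend on line breaks.
found = re.findall(r"@FBst\d+", one_line)

print(repr(replaced))
print(found)
```

Here `re.findall` is a Python facility with no direct XPath 2.0 counterpart. It is shown only to contrast replacing around a match with collecting the matches themselves.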

Thanks everyone!

0

Source: https://habr.com/ru/post/921888/

