How to extract text matching pattern in XPATH?

Question

I have data that looks like this:

<value>v13772 @FBst0451145:w&lt;up&gt;1118&lt;/up&gt;; P{GD3649} v13772@ v13773 @FBst0451146:w&lt;up&gt;1118&lt;/up&gt;; P{GD3649} v13773@ </value>

How can I process this string in XPath to extract any and all @FBst####### numbers?

I know about the XPath matches() function, but it only returns true or false, which is no good if I want the matching string. I searched but could not find a satisfactory answer to this problem, which is probably very common.

Thanks!

+6

regex pattern-matching xpath

JD. Aug 1 '12 at 20:31

4 answers

Try

 tokenize(value, '[^0-9]+')

which should return a sequence of tokens, split wherever a run of non-digit characters occurs.
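To see what splitting on non-digit runs yields for the sample data, here is a rough sketch in Python. It assumes that `re.split` behaves like tokenize() for this simple pattern, and the names `sample`, `tokens`, and `digit_runs` are hypothetical. Note that it returns every run of digits, including the v-numbers and the 1118 superscripts, not only the FBst IDs.

```python
import re

# Sample text as XPath would see it (entities such as &lt; already decoded).
sample = ("v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@ "
          "v13773 @FBst0451146:w<up>1118</up>; P{GD3649} v13773@ ")

# Split on runs of non-digit characters, mirroring tokenize(value, '[^0-9]+').
tokens = re.split(r"[^0-9]+", sample)

# A separator at the very start or end leaves an empty token; drop those.
digit_runs = [t for t in tokens if t]

print(digit_runs)
```

If only the FBst IDs are wanted, the tokens would still need filtering, for example by position or by length.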

+2

Michael Kay Aug 1 '12 at 10:31

If you can also use XQuery, the functx:get-matches() function from the FunctX module should work for you. Download the module file that supports your version of XQuery, then import the module wherever you need its functionality.

 import module namespace functx = "http://www.functx.com" at "functx-1.0-doc-2007-01.xq";
 functx:get-matches(string-join(//text(), ''), 'xyz')
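As a rough analogue of what a "get matches" style function returns, the sketch below collects every substring that matches a pattern. It uses Python's `re.findall` rather than FunctX itself, so treat it as an illustration of the idea only. The pattern `@FBst\d+` is borrowed from the other answers, and `sample` and `ids` are hypothetical names.

```python
import re

# Sample text as XPath would see it (entities such as &lt; already decoded).
sample = ("v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@ "
          "v13773 @FBst0451146:w<up>1118</up>; P{GD3649} v13773@ ")

# Collect every non-overlapping match of the pattern, in document order.
ids = re.findall(r"@FBst\d+", sample)

print(ids)
```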

+1

Sicco Aug 1 '12 at 20:44

Building on Dimitre's answer, a working regular expression:

 replace(.,'.*?(@FBst\d+).*','$1 ','m')

Although this only works when a newline separates each target line, it will do for now.

Thanks everyone!

0

JD. Aug 3 '12 at 17:28

Dimitre Novatchev · Accepted Answer · 2012-08-02T05:40:55+0000

In addition to Michael Kay's good answer, if you want to use only the replace() function, use:

 replace(.,'.*?(@FBst\d+).*','$1')

Result:

 @FBst0451145 @FBst0451146

And if you only need the numbers from the above result, use:

 replace(replace(.,'.*?(@FBst\d+).*','$1'), '[^0-9]+', ' ')

This produces:

  0451145 0451146
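The replace() approach depends on each record sitting on its own line, because `.` does not match a newline by default. The sketch below illustrates this with Python's `re.sub`, under the assumption that Python's regex engine behaves like XPath 2.0's for this pattern (`\1` stands in for `$1`). The variable names are hypothetical.

```python
import re

line1 = "v13772 @FBst0451145:w<up>1118</up>; P{GD3649} v13772@"
line2 = "v13773 @FBst0451146:w<up>1118</up>; P{GD3649} v13773@"
pattern = r".*?(@FBst\d+).*"

# One record per line: '.' stops at the newline, so each line is
# replaced by its own ID and the newline between them is kept.
multi = re.sub(pattern, r"\1", line1 + "\n" + line2)

# Both records on one line: the trailing '.*' swallows the second ID,
# so only the first one survives.
single = re.sub(pattern, r"\1", line1 + " " + line2)

# Second step: turn every run of non-digits into a single space,
# which leaves only the numeric parts (with a leading space).
numbers = re.sub(r"[^0-9]+", " ", multi)

print(repr(multi), repr(single), repr(numbers))
```

When the records are not reliably separated by newlines, a tokenize-based or match-collecting approach avoids this dependency.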
