XPath to exclude specific URLs using a pattern

I have the following XPath expression:

//BLOCKQUOTE[@class='postcontent restore']/A

Now I want to exclude certain links using a wildcard.

Something like: where the attribute @href != 'http://domain.com/download.php*'

How can I do this?

1 answer

Use:

//BLOCKQUOTE[@class='postcontent restore']/A[@href = 'http://domain.com/download.php']

This selects every A element in the XML document whose href attribute is exactly 'http://domain.com/download.php' and that is a child of a BLOCKQUOTE element whose class attribute has the string value 'postcontent restore'.

If you want to select the links whose URL starts with this address, use:

//BLOCKQUOTE[@class='postcontent restore']/A[starts-with(@href, 'http://domain.com/download.php')]

Update: In a comment, the OP clarified:

I want to exclude ... anything starting with this link/URL

In that case, use:

//BLOCKQUOTE[@class='postcontent restore']/A[not(starts-with(@href, 'http://domain.com/download.php'))]
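
To see the effect of the exclusion on concrete data, here is a minimal Python sketch. The sample markup, the PREFIX constant, and the links_excluding_prefix helper are all made up for illustration; the question does not show the real page. It uses only the standard library's xml.etree.ElementTree, which supports just a subset of XPath (no starts-with() or not()), so the prefix test is done with str.startswith as a stand-in for the predicate. With a full XPath 1.0 engine the expression above could presumably be passed as-is.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real page is not shown in the question.
SAMPLE = """
<HTML>
  <BODY>
    <BLOCKQUOTE class="postcontent restore">
      <A href="http://domain.com/download.php?id=1">download</A>
      <A href="http://domain.com/thread.php?id=7">thread</A>
      <A href="http://example.org/">external</A>
    </BLOCKQUOTE>
    <BLOCKQUOTE class="other">
      <A href="http://domain.com/page.php">ignored</A>
    </BLOCKQUOTE>
  </BODY>
</HTML>
""".strip()

# URL prefix to exclude, taken from the question.
PREFIX = "http://domain.com/download.php"


def links_excluding_prefix(root, prefix):
    """Return the href of each A child of the target BLOCKQUOTE
    whose href does not start with the given prefix."""
    # ElementTree understands the [@class='...'] predicate and the child step.
    anchors = root.findall(".//BLOCKQUOTE[@class='postcontent restore']/A")
    # Stand-in for the XPath predicate [not(starts-with(@href, prefix))].
    # A missing href is treated as an empty string, as XPath would do.
    hrefs = (a.get("href", "") for a in anchors)
    return [href for href in hrefs if not href.startswith(prefix)]


root = ET.fromstring(SAMPLE)
kept = links_excluding_prefix(root, PREFIX)
print(kept)
```

With this sample, only the download.php link inside the matching BLOCKQUOTE is dropped; the link in the second BLOCKQUOTE is never considered because its class value differs.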

Source: https://habr.com/ru/post/1393473/

