SPARQL exact regular expression matching

I use the following SPARQL query to retrieve DBpedia pages that use specific infoboxes:

 PREFIX dbo: <http://dbpedia.org/ontology/>
 PREFIX dbpedia: <http://dbpedia.org/property/>
 PREFIX res: <http://dbpedia.org/resource/>
 SELECT DISTINCT * WHERE {
   ?page dbpedia:wikiPageUsesTemplate ?template .
   ?page rdfs:label ?label .
   FILTER (regex(?template, 'Infobox_artist')) .
   FILTER (lang(?label) = 'en')
 }
 LIMIT 100

In this query line:

 FILTER (regex(?template, 'Infobox_artist')) . 

I get every infobox whose name contains "Infobox_artist", such as Infobox_artist_discography and others that I don't need. My question is: how can I get only the infoboxes that match "Infobox_artist" exactly?

+4
3 answers

Since this is a regular expression, you can anchor the pattern to restrict the match:

 FILTER (regex(?template, '^Infobox_artist$')) .
  • ^ matches the start of the string
  • $ matches the end of the string

NB: I have not used SPARQL, so this may not work.
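
To see what the anchors change, here is a minimal sketch using Python's re module as a stand-in. It is not the SPARQL regex engine, but ^ and $ behave the same way on single-line strings like these. The three template values are illustrative and assume that ?template holds full resource URIs, as the other answers indicate. Under that assumption a pattern anchored at both ends would have to cover the whole URI, so only the end anchor is used in the last case. Also, regex in the SPARQL specification operates on strings, so a portable query would likely need str(?template).

```python
import re

# Illustrative values, assuming templates are full DBpedia resource URIs.
BASE = "http://dbpedia.org/resource/Template:"
templates = [
    BASE + "Infobox_artist",
    BASE + "Infobox_artist_discography",
    BASE + "Infobox_musical_artist",
]


def matching(pattern):
    """Return the templates in which the pattern is found anywhere."""
    return [t for t in templates if re.search(pattern, t)]


# No anchors: also picks up Infobox_artist_discography.
unanchored = matching(r"Infobox_artist")

# Both anchors: the string starts with "http://", so nothing matches.
both_anchors = matching(r"^Infobox_artist$")

# End anchor plus the preceding ':' keeps only the exact template.
end_anchor = matching(r":Infobox_artist$")

print(unanchored)
print(both_anchors)
print(end_anchor)
```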

+5

While @beny23's approach works, it is very inefficient. Using a regular expression for what is really an exact-value match (potentially) puts a needless load on the endpoint you are querying. This is bad practice.

The value of ?template is a URI, so you should really use a direct value comparison (or even inline the URI into the triple pattern, as @cygri shows):

 SELECT DISTINCT * {
   ?page dbpedia:wikiPageUsesTemplate ?template .
   ?page rdfs:label ?label .
   FILTER (lang(?label) = 'en')
   FILTER (?template = <http://dbpedia.org/resource/Template:Infobox_artist>)
 }
 LIMIT 100

You can still easily adapt this query string in your code to work with different types of infoboxes. In addition, depending on which toolkit you use to create and execute SPARQL queries, you may have programmatic alternatives that make the query easier to reuse.

For example, you can create a "prepared query" that you reuse and bind to a specific value before executing it. In Sesame, you can do something like this:

 // Same query, but without the FILTER on ?template.
 String q = "SELECT DISTINCT * { "
     + " ?page dbpedia:wikiPageUsesTemplate ?template . "
     + " ?page rdfs:label ?label . "
     + " FILTER (lang(?label) = 'en') "
     + " } LIMIT 100 ";
 TupleQuery query = conn.prepareTupleQuery(SPARQL, q);
 // Bind ?template to the desired infobox URI before evaluating.
 URI infoboxArtist = f.createURI(DBPedia.NAMESPACE, "Template:Infobox_artist");
 query.setBinding("template", infoboxArtist);
 TupleQueryResult result = query.evaluate();

(Aside: I show an example using Sesame because I'm on the Sesame development team, but no doubt other SPARQL/RDF toolkits have similar functionality.)
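
If your toolkit has no prepared-query support, adapting the query string in code might look roughly like the sketch below. It is an assumption-laden illustration, not part of any library: the helper build_infobox_query and the constant TEMPLATE_NS are hypothetical names, the namespace is copied from the URI used in the query above, and the name check is a deliberately strict guess at what infobox names look like.

```python
import re

# Assumed namespace, copied from the template URI used in this answer.
TEMPLATE_NS = "http://dbpedia.org/resource/Template:"


def build_infobox_query(infobox_name, limit=100):
    """Return a SPARQL query for pages that use exactly the given infobox.

    Hypothetical helper: it only builds the query text, it does not run it.
    """
    # Accept only simple names so the value cannot break out of the <...> IRI.
    if not re.fullmatch(r"[A-Za-z0-9_]+", infobox_name):
        raise ValueError(f"unexpected infobox name: {infobox_name!r}")
    return (
        "PREFIX dbpedia: <http://dbpedia.org/property/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT * {\n"
        "  ?page dbpedia:wikiPageUsesTemplate ?template .\n"
        "  ?page rdfs:label ?label .\n"
        "  FILTER (lang(?label) = 'en')\n"
        f"  FILTER (?template = <{TEMPLATE_NS}{infobox_name}>)\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(build_infobox_query("Infobox_artist"))
```

Rejecting unexpected names up front is a design choice here: plain string building has none of the escaping a binding API would give you, so the value is checked before it is placed inside the IRI brackets.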

+2

If all you want to do is a direct comparison, you do not need a regular expression! Putting the URI straight into the triple pattern is simpler and faster:

 SELECT DISTINCT * {
   ?page dbpedia:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_artist> .
   ?page rdfs:label ?label .
   FILTER (lang(?label) = 'en')
 }
 LIMIT 100
+1

Source: https://habr.com/ru/post/1433376/

