The correct way to group multiple queries in SPARQL

I need to fetch quite a lot of data from a remote endpoint using SPARQL. The problem is that it is terribly slow. I would like to group my queries to reduce the impact of network latency on overall performance.

My queries are very simple:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * WHERE { <my_id> skos:prefLabel ?prefLabel } 

But I'm not sure how to group them correctly. For example, I assume that:

 PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * WHERE { ?id skos:prefLabel ?prefLabel . FILTER(?id IN (<my_id1>, <my_id2>, <my_id3>)) } 

is a terrible idea, as it would force the endpoint to scan every instance before filtering them.

Any hint on how to implement this group of queries is appreciated.

1 answer

Assuming your endpoint supports SPARQL 1.1, you can use the VALUES clause as follows:

 PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
 SELECT * WHERE {
   VALUES ( ?id ) {
     ( <id1> )
     ( <id2> )
     ( <id3> )
     # etc.
   }
   ?id skos:prefLabel ?prefLabel
 }

Assuming that the SPARQL engine behind your endpoint uses hash joins rather than nested loop joins to evaluate joins on shared variables (I would be very surprised if any modern implementation did not), this should perform significantly better than the equivalent FILTER (?id IN ( <id1>, <id2>, <id3> ) ) form.

NB. A good optimizer may rewrite the FILTER (?id IN ( <id1> )) form into something similar to the above, so YMMV depending on the SPARQL engine behind your endpoint.
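If the list of IDs is long, you probably want to generate the VALUES block programmatically and split it into batches, so that each round trip carries many IDs without the query growing too large. The following is only a minimal sketch of that idea: the helper names (`chunked`, `build_label_query`, `build_batched_queries`), the batch size of 100, and the `http://example.org/...` IRIs are all assumptions made for illustration, and actually sending the queries to your endpoint is left out.

```python
from typing import Iterable, Iterator, List

PREFIXES = "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>"

# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN = set('<>"{}|^`\\')


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_label_query(iris: Iterable[str]) -> str:
    """Build one SELECT query fetching skos:prefLabel for every given IRI."""
    rows = []
    for iri in iris:
        # Basic safety check: reject anything that cannot be written as <iri>.
        if any(c in _FORBIDDEN or ord(c) <= 0x20 for c in iri):
            raise ValueError(f"not a valid IRI reference: {iri!r}")
        rows.append(f"( <{iri}> )")
    values = " ".join(rows)
    return (
        f"{PREFIXES}\n"
        "SELECT ?id ?prefLabel WHERE {\n"
        f"  VALUES ( ?id ) {{ {values} }}\n"
        "  ?id skos:prefLabel ?prefLabel\n"
        "}"
    )


def build_batched_queries(iris: Iterable[str], batch_size: int = 100) -> List[str]:
    """Return one query per batch; batch_size is an arbitrary assumption."""
    return [build_label_query(batch) for batch in chunked(list(iris), batch_size)]


if __name__ == "__main__":
    # Hypothetical IRIs, for illustration only.
    ids = [f"http://example.org/concept/{n}" for n in range(1, 6)]
    for query in build_batched_queries(ids, batch_size=2):
        print(query)
        print("---")
```

Because `?id` is bound in every result row, the results of each batch can be mapped back to the IDs you asked for. A suitable batch size depends on your endpoint (request size limits, timeouts), so treat the default here as a starting point to tune.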


Source: https://habr.com/ru/post/1479560/

