SPARQL queries with parentheses throw an exception

I am trying to extract labels from DBpedia for some people. I have been partially successful, but I am stuck on the following problem. The following code works.

    import com.hp.hpl.jena.query.Query;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.query.QueryFactory;
    import com.hp.hpl.jena.query.QuerySolution;
    import com.hp.hpl.jena.query.ResultSet;

    public class DbPediaQueryExtractor {
        public static void main(String[] args) {
            String entity = "Aharon_Barak";
            // The entity name is pasted directly into the query text.
            String queryString = "PREFIX dbres: <http://dbpedia.org/resource/> SELECT * WHERE {dbres:" + entity + " <http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\"))}";
            //String queryString="select * where { ?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>; <http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\")) } LIMIT 5000000";
            QueryExecution qexec = getResult(queryString);
            try {
                // Print every English label returned for the entity.
                ResultSet results = qexec.execSelect();
                for ( ; results.hasNext(); ) {
                    QuerySolution soln = results.nextSolution();
                    System.out.print(soln.get("?o") + "\n");
                }
            } finally {
                qexec.close();
            }
        }

        public static QueryExecution getResult(String queryString) {
            // Parse the query locally, then send it to the public DBpedia endpoint.
            Query query = QueryFactory.create(queryString);
            //VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create (sparql, graph);
            QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
            return qexec;
        }
    }

However, when the entity name contains parentheses, it does not work. For instance,

 String entity = "William_H._Miller_(writer)"; 

leads to this exception:

    Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "(" "( "" at line 1, column 86.

What is the problem?

1 answer

It took some copying and pasting to see exactly what was going on. I would suggest adding line breaks to your query for readability. The query you are using is:

    PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:??? <http://www.w3.org/2000/01/rdf-schema#label> ?o
      FILTER (langMatches(lang(?o),"en"))
    }

where ??? is replaced by the contents of the string entity. You do no input validation here to ensure that the value of entity is legal to paste into the query. Based on your question, it looks like entity contains William_H._Miller_(writer), so you get the query:

    PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
      FILTER (langMatches(lang(?o),"en"))
    }

You can paste this into the DBpedia public endpoint and you will get a similar parsing error message:

    Virtuoso 37000 Error SP030: SPARQL compiler, line 6: syntax error at 'writer' before ')'

    SPARQL query:
    define sql:big-data-const 0
    #output-format:text/html
    define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
      FILTER (langMatches(lang(?o),"en"))
    }

Rather than hitting the DBpedia endpoint with bad queries, you can also use the SPARQL query validator, which reports for this query:

    Syntax error: Lexical error at line 4, column 34. Encountered: ")" (41), after : "writer"
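The input check that the original code lacks could be as small as the following sketch. The class name LocalNameCheck and the method isSafeLocalName are hypothetical, and the pattern is a deliberately conservative, ASCII-only subset of what SPARQL permits in the local part of a prefixed name, so it will also reject some names that are in fact legal.

```java
import java.util.regex.Pattern;

public class LocalNameCheck {
    // Conservative ASCII-only subset of the local part of a SPARQL prefixed name
    // (an assumption made for this sketch, not the full grammar):
    // starts with a letter, digit, or underscore; may contain dots and hyphens;
    // must not end with a dot. Parentheses are not accepted.
    private static final Pattern SAFE_LOCAL_NAME =
        Pattern.compile("[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?");

    // Returns true if the name can be pasted after "dbres:" without escaping.
    public static boolean isSafeLocalName(String localName) {
        return SAFE_LOCAL_NAME.matcher(localName).matches();
    }

    public static void main(String[] args) {
        String[] entities = { "Aharon_Barak", "William_H._Miller_(writer)" };
        for (String entity : entities) {
            System.out.println(entity + " -> " + isSafeLocalName(entity));
        }
    }
}
```

With a check like this, the parenthesized name is rejected before any query is built, instead of surfacing later as a parse exception.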

In Jena, you can use ParameterizedSparqlString to avoid such problems. Here is your example, reworked to use a parameterized string:

    import com.hp.hpl.jena.query.ParameterizedSparqlString;

    public class PSSExample {
        public static void main( String[] args ) {
            // Create a parameterized SPARQL string for the particular query, and add the
            // dbres prefix to it, for later use.
            final ParameterizedSparqlString queryString = new ParameterizedSparqlString(
                    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
                    "SELECT * WHERE\n" +
                    "{\n" +
                    "  ?entity rdfs:label ?o\n" +
                    "  FILTER (langMatches(lang(?o),\"en\"))\n" +
                    "}\n"
                    ) {{
                setNsPrefix( "dbres", "http://dbpedia.org/resource/" );
            }};

            // Entity is the same.
            final String entity = "William_H._Miller_(writer)";

            // Now retrieve the URI for dbres, concatenate it with entity, and use
            // it as the value of ?entity in the query.
            queryString.setIri( "?entity", queryString.getNsPrefixURI( "dbres" )+entity );

            // Show the query.
            System.out.println( queryString.toString() );
        }
    }

Output:

    PREFIX dbres: <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT * WHERE
    {
      <http://dbpedia.org/resource/William_H._Miller_(writer)> rdfs:label ?o
      FILTER (langMatches(lang(?o),"en"))
    }

You can run this query on the public endpoint and get the expected results. Note that if you use an entity that does not need special escaping, for example,

 final String entity = "George_Washington"; 

then the query output will use the prefix form:

    PREFIX dbres: <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT * WHERE
    {
      dbres:George_Washington rdfs:label ?o
      FILTER (langMatches(lang(?o),"en"))
    }

This is very convenient because you do not need to check whether your suffix, i.e., entity, contains any characters that need to be escaped; Jena takes care of this for you.
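The two outputs above can be read as a choice between two spellings of the same IRI: the prefixed form when the suffix is simple, and the full IRI in angle brackets otherwise. The sketch below imitates that choice with the standard library only. It is not Jena's implementation; the class IriTerm and the method toTerm are hypothetical names, and the "simple suffix" pattern is a conservative ASCII-only assumption.

```java
import java.util.regex.Pattern;

public class IriTerm {
    // Conservative ASCII-only pattern for suffixes that are safe in prefixed form
    // (an assumption for this sketch; SPARQL allows more than this).
    private static final Pattern SIMPLE_LOCAL_NAME =
        Pattern.compile("[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?");

    // Characters that SPARQL does not allow between the angle brackets of an IRI.
    private static final String ILLEGAL_IN_IRI = "<>\"{}|^`\\";

    // Returns prefix:localName when the suffix is simple, else <namespace + localName>.
    public static String toTerm(String prefix, String namespace, String localName) {
        if (SIMPLE_LOCAL_NAME.matcher(localName).matches()) {
            return prefix + ":" + localName;
        }
        String iri = namespace + localName;
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            // Control characters and spaces (U+0000 to U+0020) are also illegal.
            if (c <= 0x20 || ILLEGAL_IN_IRI.indexOf(c) >= 0) {
                throw new IllegalArgumentException("Illegal character in IRI at index " + i);
            }
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        String ns = "http://dbpedia.org/resource/";
        System.out.println(toTerm("dbres", ns, "George_Washington"));
        System.out.println(toTerm("dbres", ns, "William_H._Miller_(writer)"));
    }
}
```

Parentheses are legal inside a full IRI, which is why the angle-bracket form parses where dbres:William_H._Miller_(writer) does not. In real code, ParameterizedSparqlString remains the better option, since it handles this for you.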


Source: https://habr.com/ru/post/951718/

