Querying large RDF datasets without loading them into memory

I want to download two or more datasets to my computer and run a SPARQL endpoint for each of them. I tried Fuseki, which is part of the Jena project, but it loads the entire dataset into memory. That is not very desirable if I intend to query large datasets such as DBpedia, given that I also plan to do other things at the same time (running several SPARQL endpoints and a federated query system over them).

Just to give you the bigger picture: I intend to link several datasets using SILK and query them through the FedX federated query system. If you would recommend any changes to the systems I am using, or can offer feedback, that would be great. It would also be useful if you could suggest a dataset that would fit this project.


Jena Fuseki can use TDB as a storage engine, and TDB stores data on disk. The TDB documentation on caching on 32-bit and 64-bit Java systems discusses how file contents are memory-mapped. I do not think TDB/Fuseki loads the entire dataset into memory; that simply would not be possible for large datasets, and TDB can handle fairly large ones. I think you should look into using tdbloader to build a TDB store, and then point Fuseki at it.
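As a sketch of that workflow (assuming the Jena command-line tools are on your PATH and the data is in a file named dbpedia.nt — both illustrative assumptions, not details from this question):

    # Bulk-load the dump into an on-disk TDB store under ./tdb-dbpedia.
    # tdbloader builds the indexes on disk, so the dataset does not
    # have to fit into RAM.
    tdbloader --loc=./tdb-dbpedia dbpedia.nt

    # Sanity check: query the store directly with tdbquery.
    tdbquery --loc=./tdb-dbpedia 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'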

There is an example of setting up a TDB store in this answer. There, the query is run using tdbquery, but according to the Fuseki documentation on starting the server, all you need to start Fuseki with the same TDB store is the --loc=DIR option (see the example below the option description):

  • --loc=DIR
    Use an existing TDB database. Creates an empty one if it does not exist.
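For example, continuing with the hypothetical ./tdb-dbpedia store built above:

    # Serve the existing TDB store as a SPARQL endpoint on the
    # default port 3030 (an empty store is created if the directory
    # does not exist).
    fuseki-server --loc=./tdb-dbpedia /dbpedia

    # From another shell: query the endpoint over the SPARQL protocol.
    curl http://localhost:3030/dbpedia/query \
         --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 5'

To serve several datasets at once, start one Fuseki instance per store, each on its own port (e.g. fuseki-server --port=3031 --loc=./tdb-other /other); those separate endpoints are then what a federation engine such as FedX would query across.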
