If you want to find out how many categories DBpedia contains, you can run the following SPARQL COUNT query:
SELECT (COUNT(DISTINCT ?category) AS ?count) WHERE {?subject dcterms:subject ?category}
The result shows that DBpedia has 503788 categories. If you request all categories at once, the endpoint will not return all 503788 of them, because it caps the number of results a single query can return. You can, however, issue multiple queries with LIMIT and OFFSET. For example, to get the first 1000 categories, run the following query:
SELECT DISTINCT ?category WHERE {?subject dcterms:subject ?category} LIMIT 1000 OFFSET 0
I don't know how you are going to use this information, but my recommendation is to run several queries with an increasing offset (for example, 1000, 2000, 3000) and cache the results in whatever storage you use. You can simply write a program that executes the queries and puts the results in the cache.
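As a rough illustration of such a program, here is a minimal Python sketch. It assumes the public endpoint at https://dbpedia.org/sparql and that the endpoint honours the standard `application/sparql-results+json` Accept header. The names `fetch_page` and `fetch_all_categories` are made up for this example. Since the query has no ORDER BY, SPARQL does not guarantee a stable order between pages, so the sketch removes duplicates as it goes and may still miss entries if the order shifts between requests.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumption: public DBpedia endpoint
PAGE_SIZE = 1000

QUERY_TEMPLATE = (
    "PREFIX dcterms: <http://purl.org/dc/terms/> "
    "SELECT DISTINCT ?category "
    "WHERE {{ ?subject dcterms:subject ?category }} "
    "LIMIT {limit} OFFSET {offset}"
)


def fetch_page(offset, limit=PAGE_SIZE, endpoint=ENDPOINT):
    """Run one paged query and return the category URIs on that page."""
    query = QUERY_TEMPLATE.format(limit=limit, offset=offset)
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [row["category"]["value"] for row in data["results"]["bindings"]]


def fetch_all_categories(fetch=fetch_page, limit=PAGE_SIZE):
    """Increase the offset page by page until a short page signals the end."""
    cache = {}  # dict keys keep insertion order and drop duplicates
    offset = 0
    while True:
        page = fetch(offset, limit)
        cache.update(dict.fromkeys(page))
        if len(page) < limit:
            break
        offset += limit
    return list(cache)
```

Calling `fetch_all_categories()` would issue roughly 500 requests for the count quoted above. In practice you would probably write each page to your own storage inside the loop rather than hold everything in memory, and pause between requests to be polite to the public endpoint.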
Remember, however, that the categories in DBpedia are hierarchical, so one category can be the broader category of several others.
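If the hierarchy matters for your use case, one option is to fetch the parent links as well. DBpedia expresses them with `skos:broader`. The sketch below is only an assumption about how you might organise that data: `BROADER_QUERY` and `build_children_index` are hypothetical names, and the query would need the same LIMIT/OFFSET paging as the category query.

```python
from collections import defaultdict

# Returns (child, parent) category pairs; page it with LIMIT and OFFSET too.
BROADER_QUERY = (
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#> "
    "SELECT ?child ?parent WHERE { ?child skos:broader ?parent }"
)


def build_children_index(pairs):
    """Group (child, parent) pairs into a parent -> set of children mapping."""
    index = defaultdict(set)
    for child, parent in pairs:
        index[parent].add(child)
    return dict(index)
```

Given the cached pairs, `build_children_index(pairs)[parent_uri]` would list the direct subcategories of one category.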