Sparql skos:broader_问答_开发者_运维开发者技术经验分享

I'm doing a SPARQL query on the DBpediaset, but I am having some issues (due to lack of detailed SPARQL knowledge) with a query limitation:

I first 'get' all music artists:

?person rdf:type <http://dbpedia.org/ontology/MusicalArtist> .

But I want to limit this to the broader category Category:American_musicians (via traversing skos:broader?): how?

*= while the question is specific, I've encountered this quest many t开发者_JS百科imes when wanting to running sparql queries.

This can be made easier with property paths in SPARQL 1.1

SELECT DISTINCT ( ?person )
WHERE
{
  ?person rdf:type dbpedia-owl:MusicalArtist .
  ?person skos:subject  skos:broader* category:American_musicians  .
}

Here it displays all the ancestors that could be reached via the skos:broader property.

I'm amazed this simple question hasn't been answered correctly in 3 years, and how much uncertainty and doubt people spread.

SELECT * { ?person a dbo:MusicalArtist . filter exists {?person dct:subject/skos:broader* dbc:American_musicians} }

corrected a few prefixes: dbo instead of the long dbpedia-owl, dbc instead of category. These short prefixes are builtin to DBpedia
corrected skos:subject (no such prop exists) to dct:subject
corrected the query with property paths, it was missing /
skos:broader is not transitive, skos:broaderTransitive is. However, DBpedia doesn't have the latter (no transitive reasoning)
replaced DISTINCT which is expensive with FILTER EXISTS which is much faster. The FILTER can stop at the first relevant sub-category it finds, while the original query first finds all such sub-cats per artist, then discards them (DISTINCT), sorts the artists in memory and removes duplicates.

There's no really good way to do this, but here's a verbose way:

SELECT DISTINCT ( ?person )
WHERE
{
  ?person rdf:type dbpedia-owl:MusicalArtist .
  {
    ?person skos:subject [ skos:broader category:American_musicians ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader category:American_musicians ] ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader [ skos:broader category:American_musicians ] ] ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader [ skos:broader [ skos:broader category:American_musicians ] ] ] ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader category:American_musicians ] ] ] ] ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader category:American_musicians ] ] ] ] ] ] .
  } UNION {
    ?person skos:subject [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader [ skos:broader category:American_musicians ] ] ] ] ] ] ] .
  }
}

For figuring out how many levels you need, you can change SELECT DISTINCT to SELECT COUNT DISTINCT and stop adding levels when the count stops going up.

This is really easy to perform in neo4j. An alternative to accomplish your task in SPARQL could be to extract all the subgraph under "Category:American_musicians" by iterating via code on subcategories.

Eg. pseudo code in java would be something like:

String startCategory = "<http://dbpedia.org/resource/Category:American_musicians>";
iterateTraversalFunction(startCategory);

then the traversal function would be:

public void iterateTraversalFunction(String startCategory){
     ArrayList<String> artistsURI = // SPARQL query ?person skos:subject startCategory . ?person rdf:type MusicalArtist 

    ArrayList<String> subCategoriesURI = // SPARQL query ?subCat skos startCategory
    // Repeat recursively
   for(String subCatURI: subCategoriesURI){
       iterateTraversalFunction(subCatURI);
   }
}

Hope this helps, - Dan