SPARQL* for Wikidata

August 12th, 2019 8:51 AM

I recently asked Olaf Hartig on Twitter whether he was aware of anyone using RDF* or SPARQL* to model qualified statements in Wikidata. Qualified statements are a Wikidata feature that allows a statement such as “the speed limit in Germany is 100 km/h” to be qualified as applying only to “paved road outside of settlements.” “Getting the Most out of Wikidata: Semantic Technology Usage in Wikipedia’s Knowledge Graph” by Malyshev et al., published last year at ISWC 2018, helps to visualize this data:

Visualization of Wikidata qualified statements

Although Olaf wasn’t aware of any work in this direction, I decided to look into what SPARQL* syntax might look like for Wikidata queries. Continuing with the speed limit example, we can query for German speed limits and their qualifiers:

SELECT ?speed ?qualifierLabel WHERE {
    wd:Q183
        wdt:P3086 ?speed ;
        p:P3086 [
            ps:P3086 ?speed ;
            pq:P3005 ?qualifier ;
        ] .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}

This acts much like an RDF reification query. Using SPARQL* syntax to represent the same query, I ended up with:

SELECT ?speed ?qualifierLabel WHERE {
    << wd:Q183 wdt:P3086 ?speed >>
        pq:P3005 ?qualifier ;
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

This strikes me as a more appealing syntax for querying qualified statements, since it requires neither repeating the property nor understanding the connection between wdt:P3086 and p:P3086. However, that repetition of “P3086” would still be required to access the quantityUnit and normalized values via the psv: and psn: predicate namespaces. I’m not familiar enough with the history of Wikidata to know why RDF reification wasn’t used in the modeling, but I think this shows that there are opportunities for improving the UX of the query interface (and possibly the RDF data model, especially if RDF* sees more widespread adoption in the future).
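To make the source of that repetition concrete, here is a minimal Python sketch of how a single property ID shows up under several predicate namespaces. The namespace IRIs are the ones used in the Wikidata RDF export (psv: and psn: being the full-value and normalized-value namespaces); the dictionary and the function name `property_iris` are hypothetical and exist only for illustration.

```python
from typing import Dict

# Predicate namespaces from the Wikidata RDF export, keyed by their usual prefix.
WIKIDATA_PROP_NAMESPACES: Dict[str, str] = {
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
    "psv": "http://www.wikidata.org/prop/statement/value/",
    "psn": "http://www.wikidata.org/prop/statement/value-normalized/",
    "pq": "http://www.wikidata.org/prop/qualifier/",
}


def property_iris(pid: str) -> Dict[str, str]:
    """Return the IRI of one property ID (e.g. "P3086") in each namespace."""
    return {prefix: ns + pid for prefix, ns in WIKIDATA_PROP_NAMESPACES.items()}


# The same "P3086" suffix appears once per namespace.
for prefix, iri in property_iris("P3086").items():
    print(f"{prefix}:P3086 -> <{iri}>")
```

A query that needs both the simple value and its unit has to mention the same ID under wdt: (or p: and ps:) and again under psv:, which is the repetition a SPARQL* surface syntax cannot hide on its own.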

With minimal changes to my Swift SPARQL parser, I made a proof-of-concept translator from Wikidata queries using SPARQL* syntax to standard SPARQL. It’s available in the sparql-star-wikidata branch, and as a Docker image (kasei/swift-sparql-syntax:sparql-star-wikidata):

$ docker pull kasei/swift-sparql-syntax:sparql-star-wikidata
$ docker run -t kasei/swift-sparql-syntax:sparql-star-wikidata sparql-parser wikidata -S 'SELECT ?speed ?qualifierLabel WHERE { << wd:Q183 wdt:P3086 ?speed >> pq:P3005 ?qualifier ; SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } }'
SELECT ?speed ?qualifierLabel WHERE {
    _:_blank.b1 <http://www.wikidata.org/prop/statement/P3086> ?speed .
    <http://www.wikidata.org/entity/Q183> <http://www.wikidata.org/prop/P3086> _:_blank.b1 .
    <http://www.wikidata.org/entity/Q183> <http://www.wikidata.org/prop/direct/P3086> ?speed .
    _:_blank.b1 <http://www.wikidata.org/prop/qualifier/P3005> ?qualifier .
    SERVICE <http://wikiba.se/ontology#label>
    {
        <http://www.bigdata.com/rdf#serviceParam> <http://wikiba.se/ontology#language> "en" .
    }
}
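
The rewriting visible in that output can be sketched in a few lines of Python. This is not the code in the sparql-star-wikidata branch (which is Swift and works on a parsed query); it is only an illustration of the rule the output suggests, assuming the embedded triple uses a wdt: predicate. The function name `expand_embedded_triple`, its parameters, and the `_:b1` blank node label are hypothetical.

```python
from typing import List, Tuple

WDT = "http://www.wikidata.org/prop/direct/"
P = "http://www.wikidata.org/prop/"
PS = "http://www.wikidata.org/prop/statement/"


def expand_embedded_triple(
    subject: str,
    direct_predicate: str,
    obj: str,
    annotations: List[Tuple[str, str]],
    node: str = "_:b1",
) -> List[str]:
    """Expand `<< subject direct_predicate obj >> pred value` into plain triple patterns.

    subject, direct_predicate and the annotation predicates are full IRIs;
    obj and the annotation values are SPARQL terms written as-is (e.g. "?speed").
    """
    if not direct_predicate.startswith(WDT):
        raise ValueError("this sketch only handles wdt: predicates")
    pid = direct_predicate[len(WDT):]  # e.g. "P3086"
    patterns = [
        f"{node} <{PS}{pid}> {obj} .",                  # statement node -> value
        f"<{subject}> <{P}{pid}> {node} .",             # entity -> statement node
        f"<{subject}> <{direct_predicate}> {obj} .",    # original direct triple
    ]
    # Each annotation on the embedded triple becomes a triple on the statement node.
    patterns += [f"{node} <{pred}> {value} ." for pred, value in annotations]
    return patterns


patterns = expand_embedded_triple(
    "http://www.wikidata.org/entity/Q183",
    WDT + "P3086",
    "?speed",
    [("http://www.wikidata.org/prop/qualifier/P3005", "?qualifier")],
)
print("\n".join(patterns))
```

Apart from the blank node label, the four patterns match the ones in the translator output above: the statement node is introduced once, and the qualifier is attached to it rather than to the entity.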