Back from ISWC
Trying to get back into the habit of posting here. I’m just back from two weeks of Semantic Web-related travel. First was roughly a week in northern Virginia for the International Semantic Web Conference, followed by a meeting of the SPARQL working group at the W3C Technical Plenary in Santa Clara, California. Lots of good stuff going on recently; I’ve tried to highlight some of it below.
Jesse presenting our Billion Triples Challenge work
Our work on RDF querying on supercomputers was received with much interest at the SSWS workshop. Lots of good questions were asked about the approach, and I hope I provided reasonable answers to them (though I think I could have done a better job of presenting the techniques, which would have avoided some of the resulting confusion). The big news was that our Semantic Web Challenge entry (combining the parallel RDF query work, Jesse’s parallel RDFS materialization work, and Medha’s BitMat work) won in the Billion Triples track.
Jesse and I had some good conversations with Jacopo Urbani about his MapReduce-based reasoning system, its similarity to Jesse’s reasoning work, and the overlap with the parallel query answering work (in the context of extending their systems to handle more expressive reasoning).
Members of the SPARQL working group had what I thought was a good panel Q&A at ISWC about what’s coming in SPARQL 1.1. There was some good input on (and criticism of) what we’ve got so far, and I hope we can follow up on many of the points made, including the issues of efficient federated querying and atomicity versus full ACID transactions (we discussed some of these issues at the meeting in Santa Clara).
The RDF indexing in Parliament (also presented at SSWS) looked interesting (especially the “average case analysis” that explains some of the tradeoffs of the design and why the design is good for many real-world datasets). Unfortunately, the SSWS proceedings seem to link mistakenly to a draft version of the paper. I’ve put some more thoughts on Parliament and its impact on our clustered RDF query engine on the Tetherless World blog: Parliament, storage density, and napkin math.
Leigh Dodds produced a good-looking vocabulary at VoCamp DC for Describing SPARQL Extension Functions. It seems like it would mesh nicely with the SPARQL service descriptions we’re working on for SPARQL 1.1.
Expressing Statistics with RDF does a nice job of explaining how to use the SCOVO vocabulary to describe statistical data. I’ve been using SCOVO to encode statistical descriptions for the data.gov datasets with some success (though I hear the voiD folks are working on a less verbose way to do dataset descriptions).
Silk looks like a great tool to do linking between datasets, and something I hope we can look at for the data.gov RDF work.
Finally, while sitting in on the SPARQL meeting in Santa Clara, Dave Beckett designed a very nice diagram explaining the SPARQL 1.1 Query Execution Sequence. It captures the conceptual ordering of the operations involved in a SPARQL 1.1 query (including aggregates), and I assume it maps nicely to many actual implementations (it certainly does to mine).
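To give a rough feel for what a conceptual ordering like that means, here is a small Python sketch of a query pipeline over in-memory rows. This is not a reproduction of Dave’s diagram or of any real engine: the function `run_query`, its parameters, and the sample data are all made up for illustration, and the step order (match, group, aggregate, HAVING, ORDER BY, OFFSET/LIMIT) is a simplified assumption that leaves out projection, DISTINCT, and everything else a real SPARQL 1.1 query can do.

```python
from collections import defaultdict


def run_query(rows, where, group_key, aggregate, having, order_key, offset, limit):
    """Hypothetical, simplified pipeline: each step consumes the previous step's output."""
    # 1. Pattern matching and FILTER: keep only the rows that match.
    matched = [r for r in rows if where(r)]

    # 2. GROUP BY: partition the matched rows by the grouping key.
    groups = defaultdict(list)
    for r in matched:
        groups[r[group_key]].append(r)

    # 3. Aggregates: compute one result row per group.
    aggregated = [
        {group_key: key, "agg": aggregate(members)}
        for key, members in groups.items()
    ]

    # 4. HAVING: filter on the aggregated values.
    kept = [r for r in aggregated if having(r)]

    # 5. ORDER BY: sort the surviving rows.
    ordered = sorted(kept, key=order_key)

    # 6. OFFSET / LIMIT: slice the ordered sequence last.
    return ordered[offset:offset + limit]


# Made-up sample data: total positive prices per publisher.
rows = [
    {"pub": "a", "price": 10},
    {"pub": "a", "price": 5},
    {"pub": "b", "price": 3},
    {"pub": "c", "price": 20},
    {"pub": "b", "price": -1},
]

result = run_query(
    rows,
    where=lambda r: r["price"] > 0,
    group_key="pub",
    aggregate=lambda members: sum(m["price"] for m in members),
    having=lambda r: r["agg"] > 4,
    order_key=lambda r: -r["agg"],
    offset=0,
    limit=2,
)
print(result)
```

The point is only that the ordering matters: HAVING can’t run before the aggregates exist, and LIMIT only makes sense once the rows have been sorted.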