Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, Constrained regular expressions in SPARQL, Research report 6360, INRIA Rhône-Alpes, Grenoble (FR), 32p., October 2007
RDF is a knowledge representation language dedicated to the annotation of resources within the Semantic Web. Though RDF itself can be used as a query language for an RDF knowledge base (using RDF consequence), the need for added expressivity in queries has led to the definition of the SPARQL query language. SPARQL queries are defined on top of graph patterns that are basically RDF (and more precisely GRDF) graphs. To be able to characterize paths of arbitrary length in a query (e.g., "does there exist a trip from town A to town B using only trains and buses?"), we have already proposed the PRDF (for Path RDF) language, effectively mixing RDF reasonings with database-inspired regular paths. However, these queries do not allow expressing constraints on the internal nodes (e.g., "Moreover, one of the stops must provide a wireless connection."). To express these constraints, we present here an extension of RDF, called CPRDF (for Constrained paths RDF). For this extension of RDF, we provide an abstract syntax and an extension of RDF semantics. We characterize query answering (the query is a CPRDF graph, the knowledge base is an RDF graph) as a particular case of CPRDF entailment that can be computed using some kind of graph homomorphism. Finally, we use CPRDF graphs to generalize SPARQL graph patterns, defining the CPSPARQL extension of that query language, and prove that the problem of query answering using only CPRDF graphs is an NP-hard problem, and query answering thus remains a PSPACE-complete problem for CPSPARQL.
semantic web, query language, RDF, SPARQL, regular expressions