Bibliography on Semantic web queries/Interrogation du web sémantique (2017-06-06)
Mustafa Al-Bakri, Manuel Atencia, Steffen Lalande, Marie-Christine Rousset, Inferring same-as facts from linked data: an iterative import-by-query approach, in: Blai Bonet, Sven Koenig (eds), Proc. 29th conference on Conference on Artificial Intelligence (AAAI), Austin (TX US), pp9-15, 2015
In this paper we model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. We describe a novel import-by-query algorithm that alternates steps of sub-query rewriting and of tailored querying the Linked Data cloud in order to import data as specific as possible for inferring or contradicting given target same-as facts. Experiments conducted on a real-world dataset have demonstrated the feasibility of this approach and its usefulness in practice for data linkage and disambiguation.
LOD, Data interlinking
Faisal Alkhateeb, Jérôme Euzenat, Constrained regular expressions for answering RDF-path queries modulo RDFS, International Journal of Web Information Systems 10(1):24-50, 2014
The standard SPARQL query language is currently defined for querying RDF graphs without RDFS semantics. Several extensions of SPARQL to RDFS semantics have been proposed. In this paper, we discuss extensions of SPARQL that use regular expressions to navigate RDF graphs and may be used to answer queries considering RDFS semantics. In particular, we present and compare nSPARQL and our proposal CPSPARQL. We show that CPSPARQL is expressive enough to answer full SPARQL queries modulo RDFS. Finally, we compare the expressiveness and complexity of both nSPARQL and the corresponding fragment of CPSPARQL, that we call cpSPARQL. We show that both languages have the same complexity through cpSPARQL, being a proper extension of SPARQL graph patterns, is more expressive than nSPARQL.
semantic web, query language, RDF, RDFS, SPARQL, nSPARQL, CPSPARQL, cpSPARQL, regular expression, constrained regular expression
Faisal Alkhateeb, Jérôme Euzenat, Answering SPARQL queries modulo RDF Schema with paths, Research report 8394, INRIA Rhône-Alpes, Grenoble (FR), 46p., November 2013
SPARQL is the standard query language for RDF graphs. In its strict instantiation, it only offers querying according to the RDF semantics and would thus ignore the semantics of data expressed with respect to (RDF) schemas or (OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or considering external ontologies. We introduce a general framework which allows for expressing query answering modulo a particular semantics in an homogeneous way. In this paper, we discuss extensions of SPARQL that use regular expressions to navigate RDF graphs and may be used to answer queries considering RDFS semantics. We also consider their embedding as extensions of SPARQL. These SPARQL extensions are interpreted within the proposed framework and their drawbacks are presented. In particular, we show that the PSPARQL query language, a strict extension of SPARQL offering transitive closure, allows for answering SPARQL queries modulo RDFS graphs with the same complexity as SPARQL through a simple transformation of the queries. We also consider languages which, in addition to paths, provide constraints. In particular, we present and compare nSPARQL and our proposal CPSPARQL. We show that CPSPARQL is expressive enough to answer full SPARQL queries modulo RDFS. Finally, we compare the expressiveness and complexity of both nSPARQL and the corresponding fragment of CPSPARQL, that we call cpSPARQL. We show that both languages have the same complexity through cpSPARQL, being a proper extension of SPARQL graph patterns, is more expressive than nSPARQL.
semantic web, query language, query modulo schema, RDF, RDF Schema, SPARQL, regular expression, Constrained regular expression, Path, PSPARQL, NSPARQL, CPSPARQL, cpSPARQL, nSPARQL
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, Evaluating and benchmarking SPARQL query containment solvers, in: Proc. 12th conference on International semantic web conference (ISWC), Sydney (NSW AU), (Harith Alani, Lalana Kagal, Achile Fokoue, Paul Groth, Chris Biemann, Josiane Xavier Parreira, Lora Aroyo, Natalya Noy, Christopher Welty, Krzysztof Janowicz (eds), The semantic web (Proc. 12th conference on International semantic web conference (ISWC)), Lecture notes in computer science 8219, 2013), pp408-423, 2013
Query containment is the problem of deciding if the answers to a query are included in those of another query for any queried database. This problem is very important for query optimization purposes. In the SPARQL context, it can be equally useful. This problem has recently been investigated theoretically and some query containment solvers are available. Yet, there were no benchmarks to compare theses systems and foster their improvement. In order to experimentally assess implementation strengths and limitations, we provide a first SPARQL containment test benchmark. It has been designed with respect to both the capabilities of existing solvers and the study of typical queries. Some solvers support optional constructs and cycles, while other solvers support projection, union of conjunctive queries and RDF Schemas. No solver currently supports all these features or OWL entailment regimes. The study of query demographics on DBPedia logs shows that the vast majority of queries are acyclic and a significant part of them uses UNION or projection. We thus test available solvers on their domain of applicability on three different benchmark suites. These experiments show that (i) tested solutions are overall functionally correct, (ii) in spite of its complexity, SPARQL query containment is practicable for acyclic queries, (iii) state-of-the-art solvers are at an early stage both in terms of capability and implementation.
Faisal Alkhateeb, Jérôme Euzenat, Querying RDF data, in: Sherif Sakr, Eric Pardede (eds), Graph data management: techniques and applications, IGI Global, Hershey (PA US), 2012, pp337-356
This chapter provides an introduction to the RDF language as well as surveys the languages that can be used for querying RDF graphs. Then it reviews some of the languages that can be used for querying RDF and provides a comparison between these query languages.
RDF, RDF Model, Querying RDF, SPARQL, SPARQL Extensions
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, SPARQL query containment under RDFS entailment regime, in: Proc. 6th International joint conference on automated reasoning (IJCAR), Manchester (UK), (Bernhard Gramlich, Dale Miller, Uli Sattler (eds), Proc. 6th International joint conference on automated reasoning (IJCAR), Lecture notes in computer science 7364, 2012), pp134-148, 2012
The problem of SPARQL query containment is defined as determining if the result of one query is included in the result of another one for any RDF graph. Query containment is important in many areas, including information integration, query optimization, and reasoning about Entity-Relationship diagrams. We encode this problem into an expressive logic called the mu-calculus where RDF graphs become transition systems, queries and schema axioms become formulas. Thus, the containment problem is reduced to formula satisfiability. Beyond the logic's expressive power, satisfiability solvers are available for it. Hence, this study allows to exploit these advantages.
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, SPARQL query containment under SHI axioms, in: Proc. 26th American national conference on artificial intelligence (AAAI), Toronto (ONT CA), pp10-16, 2012
SPARQL query containment under schema axioms is the problem of determining whether, for any RDF graph satisfying a given set of schema axioms, the answers to a query are contained in the answers of another query. This problem has major applications for verification and optimization of queries. In order to solve it, we rely on the mu-calculus. Firstly, we provide a mapping from RDF graphs into transition systems. Secondly, SPARQL queries and RDFS and SHI axioms are encoded into mu-calculus formulas. This allows us to reduce query containment and equivalence to satisfiability in the mu-calculus. Finally, we prove a double exponential upper bound for containment under SHI schema axioms.
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, A benchmark for semantic web query containment, equivalence and satisfiability, Research report 8128, INRIA, Grenoble (FR), 10p., July 2012
The problem of SPARQL query containment has recently attracted a lot of attention due to its fundamental purpose in query optimization and information integration. New approaches to this problem, have been put forth, that can be implemented in practice. However, these approaches suffer from various limitations: coverage (size and type of queries), response time (how long it takes to determine containment), and the technique applied to encode the problem. In order to experimentally assess implementation limitations, we designed a benchmark suite offering different experimental settings depending on the type of queries, projection and reasoning (RDFS). We have applied this benchmark to three available systems using different techniques highlighting the strengths and weaknesses of such systems.
Query containment, PSPARQL, Semantic web, RDF, Regular path queries
Melisachew Wudage Chekol, Static analysis of semantic web queries, Thèse d'informatique, Université de Grenoble, Grenoble (FR), December 2012
Query containment is defined as the problem of determining if the result of a query is included in the result of another query for any given dataset. It has major applications in query optimization and knowledge base verification. The main objective of this thesis is to provide sound and complete procedures to determine containment of SPARQL queries under expressive description logic axioms. Further, we implement these procedures to support theoretical results by experimentation. To date, testing query containment has been performed using different techniques: containment mapping, canonical databases, automata theory techniques and through a reduction to the validity problem in logic. In this thesis, we use the later technique to test containment of SPARQL queries using an expressive logic called mu-calculus. In doing so, RDF graphs are encoded as transition systems which preserves its characteristics, and queries and schema axioms are encoded as mu-calculus formulae. Thereby, query containment can be reduced to the validity test in the logic. This thesis identifies various fragments of SPARQL (and PSPARQL) and description logic schema languages for which containment is decidable. Additionally, it provides theoretically and experimentally proven procedures to check containment of those decidable fragments. Finally, this thesis proposes a benchmark for containment solvers. This benchmark is used to test and compare the current state-of-the-art containment solvers.
Containment, static analysis, SPARQL, PSPARQL, entailment regimes, OWL, RDF
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, PSPARQL query containment, Research report 7641, INRIA, Grenoble (FR), 32p., June 2011
Querying the semantic web is mainly done through SPARQL. This language has been studied from different perspectives such as optimization and extension. One of its extensions, PSPARQL (Path SPARQL) provides queries with paths of arbitrary length. We study the static analysis of queries written in this language, in particular, containment of queries: determining whether, for any graph, the answers to a query are contained in those of another query. Our approach consists in encoding RDF graphs as transition systems and queries as mu-calculus formulas and then reducing the containment problem to testing satisfiability in the logic. We establish complexity bounds and report experimental results.
Query containment, PSPARQL, Semantic web, RDF, Regular path queries
Melisachew Wudage Chekol, Jérôme Euzenat, Pierre Genevès, Nabil Layaïda, PSPARQL query containment, in: Proc. 13th International symposium on database programming languages (DBPL), Seattle (WA US), 2011
Querying the semantic web is mainly done through SPARQL. This language has been studied from different perspectives such as optimization and extension. One of its extensions, PSPARQL (Path SPARQL) provides queries with paths of arbitrary length. We study the static analysis of queries written in this language, in particular, containment of queries: determining whether, for any graph, the answers to a query are contained in those of another query. Our approach consists in encoding RDF graphs as transition systems and queries as mu-calculus formulas and then reducing the containment problem to testing satisfiability in the logic.
Query containment, PSPARQL, Semantic web, RDF, Regular path queries
Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, Extending SPARQL with regular expression patterns (for querying RDF), Journal of web semantics 7(2):57-73, 2009
RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for RDF, SPARQL allows querying RDF through graph patterns, i.e., RDF graphs involving variables. Other languages, inspired by the work in databases, use regular expressions for searching paths in RDF graphs. Each approach can express queries that are out of reach of the other one. Hence, we aim at combining these two approaches. For that purpose, we define a language, called PRDF (for "Path RDF") which extends RDF such that the arcs of a graph can be labeled by regular expression patterns. We provide PRDF with a semantics extending that of RDF, and propose a correct and complete algorithm which, by computing a particular graph homomorphism, decides the consequence between an RDF graph and a PRDF graph. We then define the PSPARQL query language, extending SPARQL with PRDF graph patterns and complying with RDF model theoretic semantics. PRDF thus offers both graph patterns and path expressions. We show that this extension does not increase the computational complexity of SPARQL and, based on the proposed algorithm, we have implemented a correct and complete PSPARQL query engine.
semantic web, query language, RDF, SPARQL, regular expressions
Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, Constrained regular expressions in SPARQL, in: Hamid Arabnia, Ashu Solo (eds), Proc. international conference on semantic web and web services (SWWS), Las Vegas (NV US), pp91-99, 2008
We have proposed an extension of SPARQL, called PSPARQL, to characterize paths of variable lengths in an RDF knowledge base (e.g. "Does there exists a trip from town A to town B?"). However, PSPARQL queries do not allow expressing constraints on internal nodes (e.g. "Moreover, one of the stops must provide a wireless access."). This paper proposes an extension of PSPARQL, called CPSPARQL, that allows expressing constraints on paths. For this extension, we provide an abstract syntax, semantics as well as a sound and complete inference mechanism for answering CPSPARQL queries.
Faisal Alkhateeb, Querying RDF(S) with regular expressions, Thèse d'informatique, Université Joseph Fourier, Grenoble (FR), June 2008
RDF is a knowledge representation language dedicated to the annotation of resources within the Semantic Web. Though RDF itself can be used as a query language for an RDF knowledge base (using RDF semantic consequence), the need for added expressivity in queries has led to define the SPARQL query language. SPARQL queries are defined on top of graph patterns that are basically RDF graphs with variables. SPARQL queries remain limited as they do not allow queries with unbounded sequences of relations (e.g. "does there exist a trip from town A to town B using only trains or buses?"). We show that it is possible to extend the RDF syntax and semantics defining the PRDF language (for Path RDF) such that SPARQL can overcome this limitation by simply replacing the basic graph patterns with PRDF graphs, effectively mixing RDF reasoning with database-inspired regular paths. We further extend PRDF to CPRDF (for Constrained Path RDF) to allow expressing constraints on the nodes of traversed paths (e.g. "Moreover, one of the correspondences must provide a wireless connection."). We have provided sound and complete algorithms for answering queries (the query is a PRDF or a CPRDF graph, the knowledge base is an RDF graph) based upon a kind of graph homomorphism, along with a detailed complexity analysis. Finally, we use PRDF or CPRDF graphs to generalize SPARQL graph patterns, defining the PSPARQL and CPSPARQL extensions, and provide experimental tests using a complete implementation of these two query languages.
Knowledge Representation Languages, RDF(S), Querying Semantic Web, SPARQL, Graph Homomorphism, Regular Languages, Regular Expressions, SPARQL Extensions, PRDF, PSPARQL, CPRDF, CPSPARQL
Faisal Alkhateeb, Sébastien Laborie, Towards extending and using SPARQL for modular document generation, in: Proc. 8th ACM symposium on document engineering (DocEng), São Paolo (BR), pp164-172, 2008
RDF is one of the most used languages for resource description and SPARQL has become its standard query language. Nonetheless, SPARQL remains limited to generate automatically documents from RDF repositories, as it can be used to construct only RDF documents. We propose in this paper an extension to SPARQL that allows to generate any kind of XML documents from multiple RDF data and a given XML template. Thanks to this extension, an XML template can itself contain SPARQL queries that can import template instances. Such an approach allows to reuse templates, divide related information into various templates and avoid templates containing mixed languages. Moreover, reasoning capabilities can be exploited using RDF Schema or simply RDFS.
Faisal Alkhateeb, Antoine Zimmermann, Query answering in distributed description logics, in: Proc. conference on New technologies, mobility and security (NTMS), Paris (FR), (Houda Labiod, Mohamad Badra (eds), Proc. conference on New technologies, mobility and security (NTMS), Paris (FR), Springer Verlag, Heidelberg (DE), 2007), pp523-534, 2007
This paper describes the notion of query answering in a distributed knowledge based system, and gives methods for computing these answers in certain cases. More precisely, given a distributed system (DS) of ontologies and ontology mappings (or bridge rules) written in Distributed Description Logics (DDL), distributed answers are defined for queries written in terms of one particular ontology. These answers may contain individuals from different ABoxes. To compute these answers, the paper provides an algorithm that reduce the problem of distributed query answering to local query answering. This algorithm is proved correct but not complete in the general case.
Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, RDF with regular expressions, Research report 6191, INRIA Rhône-Alpes, Grenoble (FR), 32p., May 2007
RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for querying an RDF knowledge base, some, such as SPARQL, are based on the formal semantics of RDF and the concept of semantic consequence, others, inspired by the work in databases, use regular expressions making it possible to search the paths in the graph associated with the knowledge base. In order to combine the expressivity of these two approaches, we define a mixed language, called PRDF (for "Paths RDF") in which the arcs of a graph can be labeled by regular expressions. We define the syntax and the semantics of these objects, and propose a correct and complete algorithm which, by a kind of homomorphism, calculates the semantic consequence between an RDF graph and a PRDF graph. This algorithm is the heart of query answering for the PSPARQL query language, the extension of the SPARQL query language which we propose and have implemented: a PSPARQL query allows to query an RDF knowledge base using graph patterns whose predicates are regular expressions.
semantic web, query language, RDF, SPARQL, regular expressions
Faisal Alkhateeb, Une extension de RDF avec des expressions régulières, in: Actes 8e rencontres nationales sur jeunes chercheurs en inteligence artificielle (RJCIA), Grenoble (FR), pp1-14, 2007
RDF est un langage de représentation de connaissances dédié à l'annotation de ressources dans le cadre du web sémantique. Parmi les langages de requêtes permettant d'interroger une base de connaissances RDF, certains, tels que SPARQL, s'appuient sur la sémantique formelle de RDF et la notion de conséquence sémantique, d'autres, inspirés par des travaux en bases de données, utilisent des expressions régulières permettant de chercher des chemins dans le graphe associé à la base de connaissances. Afin de conjuguer l'expressivité de ces deux approches, nous définissons un langage mixte, appelé PRDF (pour "Paths RDF") dans lequel les arcs d'un graphe peuvent être étiquetés par des expressions régulières. Nous définissons la syntaxe et la sémantique de PRDF, et proposons un algorithme correct et complet qui, par un homomorphisme particulière, calcule la conséquence sémantique entre un graphe RDF et un graphe PRDF. Cet algorithme est au c\oe{}ur de l'extension du langage de requêtes SPARQL que nous proposons et avons implémenté: une requête PSPARQL permet d'interroger une base de connaissances RDF en utilisant des patterns dont les prédicats sont des expressions régulières.
Faisal Alkhateeb, Antoine Zimmermann, Répondre à des requêtes dans un système distribué à base de connaissances, in: Yves Demazeau, Jérôme Euzenat, François Jacquenet, Laurent Vercouter (éds), Actes atelier sur Intelligence artificielle et web intelligence (IAWI), Grenoble (FR), ppno pagination, 2007
Un système distribué à base de connaissances comportent un ensemble d'ontologies, reliée entre elles par des relations sémantiques. Nous nous intéressons aux réponses à une requête posée en termes d'une ontologie d'un tel système. Ces réponses peuvent comporter des individus de différentes ontologies. Pour évaluer ces réponses, nous présentons deux méthodes avec leurs avantages et leurs inconvénients.
Réponses à des requêtes, Base de connaissances distribuée, Logiques de description
Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, Constrained regular expressions in SPARQL, Research report 6360, INRIA Rhône-Alpes, Grenoble (FR), 32p., October 2007
RDF is a knowledge representation language dedicated to the annotation of resources within the Semantic Web. Though RDF itself can be used as a query language for an RDF knowledge base (using RDF consequence), the need for added expressivity in queries has led to the definition of the SPARQL query language. SPARQL queries are defined on top of graph patterns that are basically RDF (and more precisely GRDF) graphs. To be able to characterize paths of arbitrary length in a query (e.g., "does there exist a trip from town A to town B using only trains and buses?"), we have already proposed the PRDF (for Path RDF) language, effectively mixing RDF reasonings with database-inspired regular paths. However, these queries do not allow expressing constraints on the internal nodes (e.g., "Moreover, one of the stops must provide a wireless connection."). To express these constraints, we present here an extension of RDF, called CPRDF (for Constrained paths RDF). For this extension of RDF, we provide an abstract syntax and an extension of RDF semantics. We characterize query answering (the query is a CPRDF graph, the knowledge base is an RDF graph) as a particular case of CPRDF entailment that can be computed using some kind of graph homomorphism. Finally, we use CPRDF graphs to generalize SPARQL graph patterns, defining the CPSPARQL extension of that query language, and prove that the problem of query answering using only CPRDF graphs is an NP-hard problem, and query answering thus remains a PSPACE-complete problem for CPSPARQL.
semantic web, query language, RDF, SPARQL, regular expressions
Jérôme Euzenat, Semantic web semantics, Lecture notes, université Joseph Fourier, Grenoble (FR), 190p., 2007
Faisal Alkhateeb, Jean-François Baget, Jérôme Euzenat, Complex path queries for RDF graphs, in: Proc. ISWC poster session, Galway (IE), ppPID-52, 2005