MOSIG Master 2ND YEAR Research
YEAR 2015/2016


ADVISORS: Manuel Atencia and Jérôme Euzenat

TEL: +33 (0)476 61 53 55

EMAIL: Manuel:Atencia#inria:fr, Jerome:Euzenat#inria:fr


MASTER PROFILE: Artificial intelligence and the web

Reference number: Proposal n°1926


Reasoning with link keys

The goal of the semantic web is to take advantage of formalised knowledge at the scale of the worldwide web. This has led to the release of a vast quantity of data expressed in semantic web formalisms (RDF) [Heath 2011a]. Part of the added value of linked data lies in the links that identify the same entity in different data sets, since such links make it possible to draw inferences across data sets. For instance, they may identify the same books and articles in different bibliographic data sources. Finding the manifestations of the same entity across several data sets is therefore an important task of linked data.

One way of identifying entities is to use link keys, which generalise the keys usually found in databases to several data sets. A link key [Atencia 2014b] is a statement of the form:

( {⟨p1, q1⟩,... ⟨pn, qn⟩} linkkey ⟨c, d⟩ )
stating that whenever an instance of the class c has the same values for properties p1,... pn as an instance of the class d has for properties q1,... qn, these two instances denote the same entity. For example, an instance of the class Livre may be equivalent to an instance of the class Novel as soon as their properties auteur and titre on the one side and creator and title on the other side have the same values. Such keys are slightly more complex than those of databases because, in RDF, properties are not necessarily functional (they may have several values) and their values may be other objects.
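
The definition above can be illustrated by a minimal Python sketch that applies a link key to two small data sets. It is only an illustration under stated assumptions, not the procedure of [Atencia 2014b]: RDF data are represented as plain (subject, property, value) tuples, "the same values" is read as equality of non-empty value sets (one possible reading for multi-valued properties), and the identifiers, the sample triples and the helper names (values, instances, same_values, apply_link_key) are all hypothetical.

```python
# Toy data sets as (subject, property, value) triples. All identifiers and
# values below are made up for illustration.
LEFT = [
    ("l:1", "rdf:type", "Livre"),
    ("l:1", "auteur", "Victor Hugo"),
    ("l:1", "titre", "Les Misérables"),
    ("l:2", "rdf:type", "Livre"),
    ("l:2", "auteur", "Émile Zola"),
    ("l:2", "titre", "Germinal"),
]
RIGHT = [
    ("r:a", "rdf:type", "Novel"),
    ("r:a", "creator", "Victor Hugo"),
    ("r:a", "title", "Les Misérables"),
    ("r:b", "rdf:type", "Novel"),
    ("r:b", "creator", "Jules Verne"),
    ("r:b", "title", "Germinal"),
]


def values(triples, subject, prop):
    """Set of values of `prop` for `subject` (properties may be multi-valued)."""
    return {o for s, p, o in triples if s == subject and p == prop}


def instances(triples, cls):
    """Subjects explicitly typed with `cls` (no ontology reasoning here)."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def same_values(left, a, p, right, b, q):
    """Assumed reading of 'same values': equal and non-empty value sets."""
    left_values = values(left, a, p)
    return bool(left_values) and left_values == values(right, b, q)


def apply_link_key(pairs, c, d, left, right):
    """Return the pairs (a, b) that the link key (pairs linkkey <c, d>) identifies."""
    return {
        (a, b)
        for a in instances(left, c)
        for b in instances(right, d)
        if all(same_values(left, a, p, right, b, q) for p, q in pairs)
    }


LINKS = apply_link_key(
    [("auteur", "creator"), ("titre", "title")], "Livre", "Novel", LEFT, RIGHT
)
print(LINKS)
```

In this toy setting only l:1 and r:a agree on both property pairs; l:2 and r:b share a title but not an author, so they are not linked. Values that are themselves objects would require comparing those objects in turn, which this sketch does not attempt.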

One further difference is that RDF, together with ontologies expressed in the OWL or RDFS languages, is a logical theory. In such a context, a link key is a logical statement like any other. As such, it may be deduced from other statements or contribute to deducing them. Indeed, the above key entails:

( {⟨p1, q1⟩,... ⟨pn, qn⟩, ⟨pn+1, qn+1⟩} linkkey ⟨c, d⟩ )
( {⟨p1, q1⟩,... ⟨pn, qn⟩} linkkey ⟨c', d⟩ )
whenever c' is subsumed by c. This applies to the previous example if the pair ⟨éditeur, publisher⟩ is added or if Livre is replaced by its subclass Roman. These rules may be based simply on set theory, like Armstrong's rules for keys [Armstrong 1974], or on the logical semantics of the ontology language.
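
The two rules above can be combined into a simple syntactic entailment test, sketched below in Python. This is a sound but incomplete check under stated assumptions, not a full reasoning procedure: the class hierarchy is given as a hypothetical dictionary of direct superclasses rather than computed by an OWL reasoner, a link key is represented as a hypothetical (pairs, c, d) tuple, and the specialisation rule, which the text states for c only, is also applied to d by symmetry.

```python
def is_subsumed(sub, sup, subclass_of):
    """True if `sub` equals `sup` or reaches it through declared superclasses.

    `subclass_of` maps a class name to its direct superclasses; it stands in
    for what an OWL reasoner would provide (an assumption of this sketch).
    """
    seen, stack = set(), [sub]
    while stack:
        cls = stack.pop()
        if cls == sup:
            return True
        if cls not in seen:
            seen.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return False


def entails(key, candidate, subclass_of):
    """Sufficient condition for `key` to entail `candidate`.

    Rule 1: adding property pairs to a link key yields an entailed link key.
    Rule 2: replacing a class by a subsumed class yields an entailed link key
    (applied here to both c and d, the latter by symmetry).
    """
    pairs, c, d = key
    cand_pairs, cand_c, cand_d = candidate
    return (
        set(pairs) <= set(cand_pairs)
        and is_subsumed(cand_c, c, subclass_of)
        and is_subsumed(cand_d, d, subclass_of)
    )


# Hypothetical hierarchy: Roman is a subclass of Livre.
HIERARCHY = {"Roman": ["Livre"]}

BASE = ({("auteur", "creator"), ("titre", "title")}, "Livre", "Novel")
EXTENDED = (BASE[0] | {("éditeur", "publisher")}, "Livre", "Novel")
SPECIALISED = (BASE[0], "Roman", "Novel")

print(entails(BASE, EXTENDED, HIERARCHY))     # extra pair: rule 1 applies
print(entails(BASE, SPECIALISED, HIERARCHY))  # Roman below Livre: rule 2 applies
print(entails(EXTENDED, BASE, HIERARCHY))     # dropping a pair is not covered
```

A negative answer from this test only means that the two rules do not apply; the candidate may still be entailed through the logical semantics of the ontology, which is precisely what a dedicated reasoning procedure would have to determine.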

Hence, it is possible to reason on link keys in different ways:

The goal of this master topic is to study reasoning procedures for link keys: what they are and under which conditions they hold.

For that purpose, link keys are expressed in the EDOAL language of the Alignment API and an OWL reasoner is used to reason on the ontologies. Specific link key reasoning may then be implemented on top of these.

Expected results


[Armstrong 1974] William Armstrong, Dependency structures of data base relationships, 6th IFIP Congress, Stockholm (SW), pp580-583, 1974
[Atencia 2014b] Manuel Atencia, Jérôme David, Jérôme Euzenat, Data interlinking through robust link key extraction, Proc. 21st ECAI, Prague (CK), pp15-20, 2014
[Heath 2011a] Tom Heath and Christian Bizer, Linked Data: Evolving the Web into a Global Data Space, Morgan & Claypool, 2011

$Id: M2R-2015-keyinf.html,v 1.3 2017/01/13 19:59:25 euzenat Exp $