Content and knowledge representation

Coming from knowledge representation, we still maintain some interest in problems related to it. EXMO mainly considers languages with well-defined semantics, and defines the semantics of some languages, such as multimedia specification languages, in order to establish the properties of computer manipulations of the representations. Over the past 20 years, the semantics of knowledge representation languages (description logics [Baader 2003], conceptual graphs and object-based languages) has been investigated. Their semantics is usually defined within model theory, which was initially developed for logics. The languages defined for the semantic web (RDF and OWL) follow that approach. RDF is a language for expressing knowledge as graphs on the web. OWL is designed for expressing ontologies: it enables describing the concepts and relations that are used within RDF.

We consider a language L as a set of syntactically defined expressions (often inductively defined by applying constructors over other expressions). A representation (o ⊆ L) is a set of such expressions. It is also called an ontology. An interpretation function (I) is inductively defined over the structure of the language and maps expressions to a structure called the interpretation domain (D). This expresses the construction of the "meaning" of an expression as a function of its components. A formula is satisfied by an interpretation if its interpretation fulfills a condition (in general, being interpreted over a particular subset of the domain). A model of a set of expressions is an interpretation satisfying all these expressions. An expression (δ) is then a consequence of a set of expressions (o) if it is satisfied by all of their models (denoted o⊨Lδ).
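The definitions above can be illustrated with a minimal sketch. It assumes a tiny propositional language, which is not one of the languages studied here: atoms are strings, compound expressions are tuples headed by a constructor name, and an interpretation simply maps each atom to a truth value. The function names (`atoms`, `satisfies`, `models`, `entails`) are illustrative choices only.

```python
from itertools import product

# Expressions are defined inductively: an atom is a string, and a compound
# expression is a tuple (constructor, sub-expression, ...).

def atoms(expr):
    """Collect the atoms occurring in an expression."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(atoms(sub) for sub in expr[1:]))

def satisfies(interp, expr):
    """Interpret `expr` inductively; `interp` maps each atom to True/False."""
    if isinstance(expr, str):
        return interp[expr]
    op, *args = expr
    if op == "not":
        return not satisfies(interp, args[0])
    if op == "and":
        return all(satisfies(interp, a) for a in args)
    if op == "or":
        return any(satisfies(interp, a) for a in args)
    if op == "implies":
        return (not satisfies(interp, args[0])) or satisfies(interp, args[1])
    raise ValueError(f"unknown constructor: {op}")

def models(o, vocabulary):
    """Enumerate the interpretations over `vocabulary` that satisfy all of `o`."""
    names = sorted(vocabulary)
    for values in product([False, True], repeat=len(names)):
        interp = dict(zip(names, values))
        if all(satisfies(interp, e) for e in o):
            yield interp

def entails(o, delta):
    """delta is a consequence of o if every model of o satisfies delta."""
    vocabulary = atoms(delta).union(*(atoms(e) for e in o))
    return all(satisfies(m, delta) for m in models(o, vocabulary))

o = [("implies", "rain", "wet"), "rain"]
print(entails(o, "wet"))   # every model of o makes "wet" true
print(entails(o, "sun"))   # some model of o makes "sun" false
```

Enumerating all interpretations is only feasible because the domain here is finite and very small; richer languages need dedicated procedures.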

A computer must be able to determine whether a particular expression (taken as a query, for instance) is a consequence of a set of axioms (a knowledge base). For that purpose, it uses programs, called provers, that can be based on the processing of a set of inference rules, on the construction of models or on procedural programming. These programs are able to deduce theorems (denoted o⊢Lδ). They are said to be sound if they only find theorems which are indeed consequences, and complete if they find all the consequences as theorems. However, depending on the language and its semantics, decidability, i.e., the ability to create sound and complete provers that always terminate, is not guaranteed. Even for decidable languages, the algorithmic complexity of provers may make them impractical to use.
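As an illustration of a prover based on inference rules, here is a minimal sketch of forward chaining. It assumes a deliberately restricted language (facts are atoms, and each rule has a list of premise atoms and one conclusion atom), for which this procedure is known to be sound, complete for atomic queries, and terminating. The names `forward_chain` and `proves` are hypothetical and do not refer to any prover mentioned in this text.

```python
def forward_chain(facts, rules):
    """Derive every atom obtainable from `facts` by applying `rules`.

    Each rule is a pair (premises, conclusion): once all premises have been
    derived, the conclusion is derived too. The loop stops at a fixpoint,
    which is always reached because the set of atoms is finite.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and set(premises) <= derived:
                derived.add(conclusion)
                changed = True
    return derived

def proves(facts, rules, delta):
    """True if the atom `delta` is a theorem of the knowledge base."""
    return delta in forward_chain(facts, rules)

facts = {"rain"}
rules = [(("rain",), "wet"), (("wet", "cold"), "ice")]
print(proves(facts, rules, "wet"))  # derivable: "rain" is a fact
print(proves(facts, rules, "ice"))  # not derivable: "cold" is never derived
```

The restriction to such rules is what keeps the procedure cheap; allowing negation or disjunction in the language would require a different, more costly prover.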

To solve this problem, a trade-off between the expressivity of the language and the complexity of its provers has to be found. These considerations have led to the definition of languages with limited complexity - like conceptual graphs and object-based representations - or of modular families of languages with associated modular prover algorithms - like description logics.

In collaboration with the Acacia and Orpailleur project teams, we have developed comparisons of several knowledge representation systems (conceptual graphs, object-based representations, and description logics) when applied to indexing documents by content [Euzenat2000a, 2001c, 2002b]. Moreover, we have described several languages in XML (Troeps, the Escrire pivot language and several description logics - see below). This work will be continued in relation with the exchange of formal knowledge, especially its communication through the web and its contribution to the web.

We have also worked with INA, through the doctoral work of Raphaël Troncy, who is linking knowledge representation and audio-visual document description. His solution combines the structure and content representation of audio-visual documents by allowing the expression of content in the OWL language and by transforming the structural description, embedded in MPEG-7, within the framework of an OWL ontology for audio-visual documents [Troncy 2003a, b, 2004a]. This system has been implemented and tested on a corpus of documents, on which it was able to answer queries that cannot be answered in either formalism alone.

We have also participated in the design of the web ontology language OWL, a language derived from description logics and recommended by W3C (see WebOnt).

The goal of the semantic web is to share knowledge. In this context, knowledge is expressed in interlinked chunks rather than large monolithic ontologies. Ontologies can be assembled from ontology modules, like program modules in software engineering. In OntoCompo, in cooperation with the Universidade Federal de Pernambuco and Santa Catarina (Frederico Freitas and Guillerme Bittencourt) and in the framework of the NeOn project, we have designed a model of modules which combines an interface and an ontology implementation, in which a module can import other modules through alignments with their interface [Euzenat 2007f]. This is a natural approach, since alignments can be used to adjust the components in the ontologies. We have provided the semantics of such modules, which is a combination of ontology semantics and our own alignment semantics.
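The shape of such a module can be pictured with a small data-structure sketch. This is only an illustration under our own assumptions, not the formal model of [Euzenat 2007f]: the class names `Correspondence` and `Module`, their fields, and the check performed on import are all hypothetical, and ontology axioms are reduced to plain strings.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Correspondence:
    """One alignment cell relating a local entity to an imported one."""
    local_entity: str
    imported_entity: str
    relation: str = "="  # assumed notation, e.g. "=" or "<"

@dataclass
class Module:
    """A module: an exposed interface plus an ontology implementation."""
    name: str
    interface: set          # entity names visible to other modules
    implementation: set     # stand-in for the ontology's axioms
    imports: dict = field(default_factory=dict)  # module name -> alignment

    def import_module(self, other, alignment):
        """Import `other` through an alignment with its interface only."""
        hidden = [c.imported_entity for c in alignment
                  if c.imported_entity not in other.interface]
        if hidden:
            raise ValueError(f"not in the interface of {other.name}: {hidden}")
        self.imports[other.name] = list(alignment)

geo = Module("geo", {"City"}, {"City subClassOf Place"})
travel = Module("travel", {"Trip"}, {"Trip hasDestination Destination"})
travel.import_module(geo, [Correspondence("Destination", "City", "<")])
print(sorted(travel.imports))
```

In this sketch, the importing module never refers to entities that the imported module keeps outside its interface, which is the point of separating interface from implementation.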

Finally, this work is used especially for Semantic interoperability, but it also shapes the methods used by Exmo.

Goal: Remaining up-to-date, acquiring experience and taking advantage of our knowledge on that topic.

Feel free to comment to Jerome:Euzenat#inria:fr, $Id: content.html,v 1.32 2016/12/25 21:04:59 euzenat Exp $