Cooperative elaboration of knowledge bases on the World-wide web

Jérôme Euzenat
INRIA Rhône-Alpes
Jerome . Euzenat À inrialpes . fr
Updated and translated from the French version that appeared in Bulletin de l'AFIA 34:6-9, 1998

This short note considers the problems raised by editing knowledge bases on the World-wide web (hereafter "the web") and their solutions. Knowledge bases and "ontologies" are not distinguished here. A set of links to the systems considered is given at the end of the presentation.
Problems related to web-site indexing or knowledge-enhanced web search are not considered here.

Knowledge bases on the web: why?

A word seldom used until recently is making a remarkable comeback in management vocabulary: the word "knowledge". It is used in connection with knowledge dissemination on the web and with corporate knowledge assets. For those who have worked on knowledge representation for years, this comeback was unexpected. However, the web is an opportunity for everyone to publish the content of knowledge bases (and to recoup the work spent in their elaboration), whatever the format of the knowledge. The accumulated knowledge is then available not only for problem solving: it can also be browsed as a document.

Moreover, formalised knowledge bases, as considered in artificial intelligence, are more than documents. Generating a set of static documents from a knowledge base thus freezes what still wants to live. It is therefore necessary to bring a dynamic dimension to the web, so that the cognitive perspective on knowledge (the ability to process it) can be exploited in a documentary context. Knowledge can then be accessed through queries that exploit the structure of the knowledge formalism (e.g. classification). CGI scripts and embedded HTTP servers are well suited to this purpose. There are also other reasons for using the web as a knowledge base interface:

Knowledge can be used without worrying about portability problems (HTTP clients are available worldwide);
Knowledge can be updated in a centralised way from a unique server;
Non-specialist users can be reached without specific training (HTML is now a lingua franca);
The knowledge base can be related to its context (bibliography, texts, lexicons, images) through links to other web sites.

Many knowledge base management systems therefore feature a web interface. One example is WebCokace, which takes advantage of a formal specification of the CML language to automatically generate a site for navigating through KADS structures.
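As a rough illustration of the dynamic access discussed above, the following sketch serves a tiny class taxonomy through an embedded HTTP server, computing each page from the classification at request time instead of generating static documents. It uses Python's standard `http.server` module. The taxonomy, the `/class/<name>` URL scheme and all function names are assumptions made for this example; they are not taken from WebCokace or from any other system cited here.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import quote, unquote

# Hypothetical toy taxonomy: class name -> direct superclass (None for the root).
TAXONOMY = {
    "Thing": None,
    "Animal": "Thing",
    "Insect": "Animal",
    "Drosophila": "Insect",
}


def subclasses(name):
    """Return the direct subclasses of a class, computed at request time."""
    return sorted(c for c, parent in TAXONOMY.items() if parent == name)


def render_class(name):
    """Build the HTML page for one class, or return None if it is unknown."""
    if name not in TAXONOMY:
        return None
    parent = TAXONOMY[name]
    lines = [f"<h1>{escape(name)}</h1>"]
    if parent is not None:
        lines.append(
            f'<p>Superclass: <a href="/class/{quote(parent)}">{escape(parent)}</a></p>'
        )
    lines.append("<ul>")
    for sub in subclasses(name):
        lines.append(f'<li><a href="/class/{quote(sub)}">{escape(sub)}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


class KBHandler(BaseHTTPRequestHandler):
    """Answers GET /class/<name> with a page derived from the taxonomy."""

    def do_GET(self):
        prefix = "/class/"
        page = None
        if self.path.startswith(prefix):
            page = render_class(unquote(self.path[len(prefix):]))
        if page is None:
            self.send_error(404, "Unknown class")
            return
        body = page.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the embedded server until interrupted."""
    HTTPServer(("localhost", port), KBHandler).serve_forever()
```

Calling `serve()` and opening `http://localhost:8000/class/Animal` in any HTTP client would display the class with links to its superclass and subclasses. Because the pages are computed on demand, a change to the taxonomy is visible at the next request.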

Editing knowledge bases on the web

In the same way as it is possible to browse a knowledge base, it should be possible to modify it. Editing a knowledge base with an HTTP client is thus a new goal. However, HTTP is a stateless protocol (i.e. each request is processed independently and the server is not expected to remember anything between requests). This limitation must be taken into account. Some systems, like WebGrid, remain within the HTTP philosophy by transmitting the content of the knowledge base in each page as hidden fields. Other systems, among which most of the database management systems (DBMS) that allow web editing, drop this constraint, which is hardly realistic for large-scale applications.
Editing is more complex than browsing: as long as the knowledge base is not modified, few errors can occur, but as soon as modification is possible, it becomes necessary to tackle various other problems:

during modification, input errors can occur that must be trapped and reported to the user;
if some objects are deleted, their URLs can remain in the client program (as bookmarks or in the navigation history) and users may try to access them: this must also be detected.
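The two problems listed above can be sketched as follows. This is only an illustration under stated assumptions: the `KnowledgeBase` class, its slot model and the choice of HTTP status codes (400 for an input error, 410 for a deleted object, 404 for an object that never existed) are hypothetical and do not describe any of the systems discussed here.

```python
class KnowledgeBase:
    """Toy in-memory store; the names and the slot model are illustrative only."""

    def __init__(self):
        self.objects = {}     # object id -> dict of slot values
        self.deleted = set()  # ids of objects that existed and were removed

    def create(self, oid, slots):
        self.objects[oid] = dict(slots)
        self.deleted.discard(oid)

    def delete(self, oid):
        if oid in self.objects:
            del self.objects[oid]
            # Keep a tombstone so that stale URLs can be recognised later.
            self.deleted.add(oid)

    def get(self, oid):
        """Return (HTTP status, payload) for a request on the object's URL."""
        if oid in self.objects:
            return 200, self.objects[oid]
        if oid in self.deleted:
            return 410, f"{oid} was deleted from the knowledge base"
        return 404, f"{oid} has never existed"

    def update(self, oid, slot, raw_value, slot_types):
        """Validate a submitted form value before modifying the base."""
        if oid not in self.objects:
            return self.get(oid)  # stale or unknown URL
        if slot not in slot_types:
            return 400, f"unknown slot {slot!r}"
        try:
            value = slot_types[slot](raw_value)
        except ValueError:
            return 400, f"invalid value {raw_value!r} for slot {slot!r}"
        self.objects[oid][slot] = value
        return 200, self.objects[oid]
```

Keeping a record of deleted identifiers lets the server tell a user following an old bookmark that the object has been removed, which is more informative than a plain "not found".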

The first system to allow the editing of knowledge structures is probably WebGrid, which provides knowledge acquisition through the web. Another outstanding contribution is Ontosaurus.
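The hidden-field technique mentioned earlier for WebGrid can be illustrated by the sketch below, in which the whole knowledge base travels with each page and the server keeps nothing between requests. The JSON and base64 encoding, the field names `state` and `element`, and the `/edit` URL are assumptions chosen for the example; the actual WebGrid encoding is not described in this note.

```python
import base64
import json
from html import escape


def encode_state(kb):
    """Serialise the whole knowledge base into a string safe for an HTML attribute."""
    raw = json.dumps(kb, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_state(field):
    """Rebuild the knowledge base from the hidden field sent back by the client."""
    return json.loads(base64.urlsafe_b64decode(field.encode("ascii")))


def render_form(kb):
    """Return an edit form that carries the current state as a hidden field."""
    return (
        '<form method="post" action="/edit">\n'
        f'  <input type="hidden" name="state" value="{escape(encode_state(kb))}">\n'
        '  <input type="text" name="element">\n'
        '  <input type="submit" value="Add element">\n'
        "</form>"
    )


def handle_submit(fields):
    """Apply one edit using only the submitted fields; nothing is kept on the server."""
    kb = decode_state(fields["state"])
    kb["elements"].append(fields["element"])
    return kb, render_form(kb)
```

Since every page embeds the complete state, the page size grows with the knowledge base, which is consistent with the remark that this constraint is hardly realistic for large-scale applications.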

Knowledge + web = collaboratory?

The availability and the elaboration of knowledge bases on the web suggest other applications: the creation of distributed collaboratories. This perspective attracts people interested in scientific research environments or ontology construction.

However, the collaborative editing of a knowledge base requires solving many related technical, legal and social problems. Among these problems are:

managing the interaction and communication between people;
controlling the access to knowledge;
properly acknowledging intellectual property rights;
trapping and correcting errors;
dealing with the concurrent modification of data.

From a technical point of view, the last problem is difficult. Few systems deal with it, and several approaches coexist. They can be summarised as follows:

APECKS allows uncontrolled concurrent editing;
The "Systematics interactive site" allows concurrent annotation rather than editing and relies on moderators;
Ontolingua protects editing access by creating private workspaces (sessions) which can be shared by several users. It does not control the modifications made during a session but notifies the users of these modifications [Alemany 1998];
GKB-Editor implements an optimistic control mechanism (based on [Chaundri& 1992]) which enables any user to edit a copy and tries to deal with conflicts when the modifications must become effective (commit time);
Co4 allows anyone to edit a private knowledge base and controls the modifications before integration. Users must submit the pieces of knowledge they want to integrate into the consensual knowledge base, and a protocol manages the submissions through votes.
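To make the optimistic approach in the list above more concrete, here is a minimal sketch of edit-a-copy followed by validation at commit time. It is a generic version-counter scheme written for this note, not the algorithm of [Chaundri& 1992] nor the GKB-Editor implementation; the class names, the frame and slot model, and the per-frame version numbers are all assumptions.

```python
import copy


class ConflictError(Exception):
    """Raised at commit time when another user changed the same frame."""


class SharedKB:
    def __init__(self):
        self.frames = {}    # frame name -> dict of slot values
        self.versions = {}  # frame name -> number of committed changes

    def checkout(self):
        """Give a user a private copy together with the versions it was based on."""
        return WorkingCopy(self, copy.deepcopy(self.frames), dict(self.versions))


class WorkingCopy:
    def __init__(self, shared, frames, base_versions):
        self.shared = shared
        self.frames = frames
        self.base_versions = base_versions
        self.touched = set()

    def set_slot(self, frame, slot, value):
        """Edit the private copy freely; no lock is taken."""
        self.frames.setdefault(frame, {})[slot] = value
        self.touched.add(frame)

    def commit(self):
        """Validate first, then write: fail if any touched frame changed meanwhile."""
        stale = sorted(
            f for f in self.touched
            if self.shared.versions.get(f, 0) != self.base_versions.get(f, 0)
        )
        if stale:
            raise ConflictError(f"frames changed by someone else: {stale}")
        for f in self.touched:
            self.shared.frames[f] = copy.deepcopy(self.frames[f])
            self.shared.versions[f] = self.base_versions.get(f, 0) + 1
            self.base_versions[f] = self.shared.versions[f]
        self.touched.clear()
```

Two users who check out the base and modify the same frame can both work without waiting, but only the first commit succeeds; the second receives a `ConflictError` and has to reconcile its changes. The sketch is not thread-safe: a real server would have to make the validation and the write in `commit` a single atomic step.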

Conclusion: a challenge for information, interaction and intelligence?

This conclusion is related to the recent reorganisation of research in these fields in France.

The problems raised by the issues presented here are diverse and challenging. As a new "Groupement de recherche" gathering the artificial intelligence, database and human-computer interaction communities is being launched, these problems could constitute an ideal field for contributions from the three disciplines on particular issues:

data coherence upon concurrent access and transaction management;
ergonomics and collaborative use (notification and interaction);
comparison of formalised content and conflict management.

Systems and applications

Several URLs are given below (please notify me if you think that you have something to add):

APECKS (Adaptive Presentation Environment for Collaborative Knowledge Structuring): developed at the University of Nottingham, aims at supporting users in the creation of "personal ontologies" by comparing them to those of other people [Tennison& 1998]. To that end, the system uses an object-based representation language and translates it to WebGrid for comparing the ontologies. It notifies the users of the differences.
Co4: developed at INRIA Rhône-Alpes, aims at cooperatively building a knowledge base expressed in the Troeps representation language [Alemany 1998]. To that end, users have their own knowledge bases, and modifying the consensual knowledge base requires the submission and acceptance of pieces of knowledge following a protocol based on the peer-review process [Euzenat 1996b]. Co4 is used in particular in the construction of the Knife knowledge base (dedicated to genetic interactions in Drosophila).
GKB-Editor: developed at SRI, is an object editor based on the "Generic Frame Protocol" (now "Open Knowledge Base Connectivity"). It is to be included in a toolbox for ontology construction [Karp& 1997]. GKB-Editor is used in the famous EcoCyc knowledge base on E. coli metabolism developed by Peter Karp.
Ontolingua server: developed at Stanford KSL, is a web-based editor of shared ontologies [Farquhar& 1995, 1997]. It has inspired several systems presented here. Knowledge is represented in the KIF and Ontolingua languages. It has been used to create various ontologies (around 50 are freely accessible from its site) such as the InterMed medical ontology.
Ontosaurus: developed at the University of Southern California (USC/ISI), is a web editing interface to LOOM [Swartout& 1996].
SIS (albibioni.snv.jussieu.fr): developed at the Université Pierre et Marie Curie by Jacques Lebbe's team (famous for having provided identification keys for mushrooms on Minitel). This site and the technique it is based on allow users to cross tables and annotate them.
WebCokace: developed at INRIA Sophia-Antipolis, allows users to browse CML descriptions (originating from KADS) but not to edit them [Corby& 1997]. It is used for displaying several classical libraries of KADS models.
WebGrid: developed at the Knowledge Science Institute of the University of Calgary, allows users to formalise their knowledge as "repertory grids". The first HTTP-compliant version [Gaines& 1995] did not store anything on the server and communicated the contents of the knowledge base as hidden fields in the HTML pages and URLs. To support the comparison of knowledge expressed by different users, WebGrid-II allows the knowledge to be stored temporarily on the server. The aim of the system is not to edit a common knowledge base, so access control is not considered.
WebOnto: developed at the Knowledge Media Institute of the Open University, is an ontology editor available as a Java applet [Domingue 1998]. WebOnto allows open editing of ontologies but requires an ontology to be locked while someone edits it. It is based on the KMI-made frame language OCML.

There are some other interesting sites:
HPKB (High-performance knowledge bases) is a DARPA-sponsored project involving most of the American projects mentioned above (and many others).
Aristotle (Automated categorization of Web resources) indexes some projects related to this topic.

References

[Alemany 1998] Christophe Alemany, Étude et réalisation d'une interface d'édition de bases de connaissances au travers du World Wide Web, Mémoire CNAM, Grenoble (FR), 1998

[Chaundri& 1992] Vinay Chaudhri, Vassos Hadzilacos, John Mylopoulos, Concurrency control for knowledge bases, Proc. 3rd KR, Cambridge (MA US), pp762-773, 1992

[Corby& 1997] Olivier Corby, Rose Dieng, A commonKADS expertise model web server, Proc. 5th ISMICK, Compiègne (FR), pp97-117, 1997

[Domingue 1998] John Domingue, Tadzebao and WebOnto: discussing, browsing, and editing ontologies on the web, Proc. 11th KAW, Banff (CA), 1998

[Euzenat 1996b] Jérôme Euzenat, Corporate memory through cooperative creation of knowledge bases and hyper-documents, Proc. 10th KAW, Banff (CA), 1996

[Farquhar& 1995] Adam Farquhar, Richard Fikes, Wanda Pratt, James Rice, Collaborative ontology construction for information integration, Research report 63, Knowledge Systems Laboratory, Stanford University, Stanford (CA US), 1995

[Farquhar& 1997] Adam Farquhar, Richard Fikes, James Rice, The Ontolingua server: a tool for collaborative ontology construction, International journal of human-computer studies 46:707-727, 1997

[Gaines& 1995] Brian Gaines, Mildred Shaw, WebMap: concept mapping on the Web, Proc. 4th WWW conference, Boston (MA US), 1995

[Karp& 1997] Peter Karp, Vinay Chaudhri, Suzanne Paley, A collaborative environment for authoring large knowledge bases, 1997, submitted for publication

[Swartout& 1996] Bill Swartout, Ramesh Patil, K. Knight, T. Russ, Toward distributed use of large scale ontologies, Proc. 10th KAW, Banff (CA), 1996

[Tennison& 1998] Jenifer Tennison, Nigel Shadbolt, APECKS: a tool to support living ontologies, Proc. 11th KAW, Banff (CA), 1998

http://exmo.inrialpes.fr/papers/bullafia98/related.html

$Id: related.html,v 1.2 2021/12/15 20:53:52 euzenat Exp $