
[ontolog-forum] Semantic interoperability, DL's, Expressive Logics and "

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ali Hashemi <ali.hashemi+ontolog@xxxxxxxxxxx>
Date: Mon, 22 Feb 2010 01:43:37 -0500
Message-id: <5ab1dc971002212243v22e8072fr178559ac19f68399@xxxxxxxxxxxxxx>
Hello all,

In this note, I'll very briefly outline how semantic interoperability could be greatly facilitated if more ontologies were captured in a formal language at least as expressive as first-order logic. Note, this doesn't mean that DL's should be abandoned or aren't good, or anything negative about them really, nor is it to suggest that ontologies should be deployed in full FOL. Rather, we need to capture _as much_ of the semantics of what we're trying to exchange information about in precise, ideally machine-readable forms. Nor is this to suggest that natural language augmentation is superfluous or somehow not as important...

I'll begin with a very high-level recap of what it means to have a formal ontology. If I am describing a domain in a formalism, I will have statements (axioms) written in some logic. These axioms are interpreted and essentially allow a bunch of "models." We say that the axioms are satisfied by a model iff every statement of the theory holds true in that model. I speak here, of course, about models in the sense of Tarski: http://en.wikipedia.org/wiki/Model_theory#First-order_logic

Refer to the diagram below. The top plane comprises a number of bubbles, each of which represents one or more axioms; we call these, interchangeably, modules / theories / named sets of axioms. More general modules are to the left, and each right arrow denotes a non-conservative extension of the preceding module (i.e. specialization).

The plane below consists of sets of models corresponding to the above modules (the colours match the corresponding module). So under an interpretation, the axioms of To are satisfied by the models contained in the set Mo. If a theory is a non-conservative extension of another, its set of models will be a subset of the models of that more general theory.
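To make the subset claim concrete, here is a minimal sketch that enumerates every binary relation on a three-element domain and keeps those satisfying a theory. The names T0, T1, M0 and M1 are my own illustrative choices (a partial order and its extension with totality); they are assumptions, not the modules in the diagram.

```python
from itertools import product

DOMAIN = (0, 1, 2)
PAIRS = [(a, b) for a in DOMAIN for b in DOMAIN]

# Each axiom is a predicate over a candidate interpretation r of "leq".
def reflexive(r):
    return all((a, a) in r for a in DOMAIN)

def antisymmetric(r):
    return all(a == b or (b, a) not in r for (a, b) in r)

def transitive(r):
    return all((a, c) in r for (a, b) in r for (b2, c) in r if b == b2)

def total(r):
    return all((a, b) in r or (b, a) in r for a, b in PAIRS)

T0 = [reflexive, antisymmetric, transitive]  # illustrative general module
T1 = T0 + [total]                            # illustrative non-conservative extension

def models(theory):
    """Return every relation on DOMAIN that satisfies all axioms of the theory."""
    found = set()
    for bits in product([False, True], repeat=len(PAIRS)):
        r = frozenset(p for p, keep in zip(PAIRS, bits) if keep)
        if all(axiom(r) for axiom in theory):
            found.add(r)
    return found

M0 = models(T0)
M1 = models(T1)
print(len(M0), len(M1), M1 <= M0)  # prints: 19 6 True
```

Adding the totality axiom removes models and never adds any, which is why the set for the more specialized theory sits inside the set for the general one.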

Now consider the infinite lattice of theories. The uppermost module will contain the set of all permissible models given the language of representation.

The quest for primitives, in the ontological sense, is finding labels (names in CL parlance, or variables and relations and functions in classic FOL terminology) which constitute a module that "slices" up said model space. Pretty much everything we say will reside in this model space. I should note, however, that the diagram is a bit simplified: the interactions between the sets of models are actually n-dimensional, but that is well beyond the scope of this note.

Ok, so how does this relate to interoperability? Well, for one, it is an open question whether there exists a set of modules and primitive names that completely partition or even cover most of the models in this space. Though perhaps all we care about is a really big, much used subset. Moreover, depending on how you look at it, one person's primitives might be another's extensions. But this is also getting off track.

Something productive:

Let us consider the notion of *time* again. A pretty fundamental idea. Now as Pat Hayes showed many years ago in his catalog of time theories, there are at least 14 ways of conceptualizing time in a way that makes sense to most humans. Time can be linearly ordered or partially ordered, it can be discrete or dense, it can consist of moments or histories, etc. Pretty much the only thing they all have in common are time points; oh wait, no, you can also use time intervals. They are both easily mappable into one another. Which is more primitive? Are we left with simply saying "there exists time", while everything else is pushed down? Now let's say we admit both "timepoint" and "timeinterval" as primitives. However, there still isn't much we can say about them, since many of the time theories differ in contradictory ways. So we're pushing even more info down. A question to consider --- when will we stop counting something as a primitive?

Let us imagine a module where the only statement which exists in it is a typing axiom:
(exists (t) (timepoint t))

This would probably be the module for timepoints, with the only axiom that all the time theories using time points can agree on.

This module captures the set of all models where a timepoint is used. In general, or rather in plurality, most theories of time pretty much agree only on the fact that there are timepoints and perhaps that they are also somehow ordered. So perhaps timepoints are a primitive, and ordering is a primitive? Who knows. What do we gain from having decided that timepoints are primitive? Are we now able to interoperate? No, because we must still figure out how our respective extensions make our notions of timepoint differ. What we've gained is that we are vaguely both talking about timepoints / time, but mapping into this module is only the beginning.
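A small sketch of why agreeing on the typing axiom alone buys so little. The two structures and the `linear_order` axiom below are hypothetical examples of my own; neither is taken from the Hayes catalog.

```python
# Two finite structures, each with timepoints and a "before" relation (both hypothetical).
linear = {
    "timepoint": {"t0", "t1", "t2"},
    "before": {("t0", "t1"), ("t1", "t2"), ("t0", "t2")},
}
branching = {
    "timepoint": {"t0", "a", "b"},
    "before": {("t0", "a"), ("t0", "b")},  # a and b are incomparable futures
}

def has_timepoint(m):
    # Corresponds to (exists (t) (timepoint t))
    return len(m["timepoint"]) > 0

def linear_order(m):
    # Any two distinct timepoints are comparable under "before".
    pts, bef = m["timepoint"], m["before"]
    return all(x == y or (x, y) in bef or (y, x) in bef for x in pts for y in pts)

def satisfies(model, axioms):
    """A model satisfies a module iff every axiom holds in it."""
    return all(axiom(model) for axiom in axioms)

timepoint_module = [has_timepoint]           # the one-axiom module
linear_time = [has_timepoint, linear_order]  # one possible extension

print(satisfies(linear, timepoint_module), satisfies(branching, timepoint_module))  # True True
print(satisfies(linear, linear_time), satisfies(branching, linear_time))            # True False
```

Both structures map into the one-axiom module, yet they disagree as soon as an ordering axiom is added, so the shared module says nothing about whether data can be exchanged safely.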

And here is where it gets interesting.

However, I must begin by making one thing clear. This note is _not_ a shot at Description Logics. They are tremendously useful and represent some of the most visible, concrete and practical advances in the field of applied ontology. They do however have some limitations, especially when it comes to semantic mappings.

Indeed, there is a danger in associating formal ontology with only one type of formalism. As many on this list have pointed out on numerous occasions, the moment "reality" is reflected through our minds into some language, we are capturing only a fragment of what we hope to represent.

I suggest, though it is certainly not a new idea, that the process of:

    World <-> Ontologist <-> Software Artifact

requires more than only DL's, or even DL's plus natural language documentation.

The reasoning is simple: a knowledge engineer wants to make whatever knowledge they are handling accessible to others. From the point of view of formal ontology, this means expressing intuitions in some machine-readable language or formalism. However, computers can only reason with whatever is explicitly stated in the language of representation.

So if you use a DL, which is designed to ensure/enforce decidability or some other computational objective, you will necessarily, for many applications, have to leave a large amount of semantic content external to the system of representation.

Constructing semantic mappings with so much information left essentially unsaid from the computer's point of view is a nightmare. It forces people to try to guess at mappings by looking at the labels other people used in their ontologies (lexical matching), or the structure of their DL statements, or looking at databases and the literals. It's messy work, and the results have been slow.

There is another, much more elegant, and in fact quicker way. Moreover, it doesn't require anyone to fundamentally change how they deploy their technologies. It does incur a bit more work on the part of the ontologist / knowledge engineer (in terms of defining things more precisely, and picking out fragments for deployment), but the payoffs are huge and it is work that will have to eventually be done anyway.

Here are some guidelines. The idea isn't really new; others on the forum have previously echoed very similar sentiments, though perhaps not in the same way (John Sowa has spoken of how he and others at VivoMind employ pretty much this idea; Cyc does it implicitly with their specialized algorithms for knowing when to apply a particular reasoning engine).

The basic idea is this: for any application, you have two ontologies:
(1) The Reference Ontology (RO)
(2) The Deployed Ontology (DO)

RO is a superset of DO.

The deployed ontology is one that might consist only of OWL DL or OWL Full or whatever your favourite / corporate-mandated tech implementation language is.

The reference ontology, on the other hand, contains definitions for your vocabulary in a language as expressive as you need (including natural language, but also FOL or even HOL, whatever makes sense in your domain). Remember, computers need to know what you are saying in a language they can understand. This is very important for semantic mappings and interoperability. I would suggest CL, but really anything at least as expressive as FOL is, I think, a must, especially for the procedure outlined below to work.


Once you have your Reference Ontology (RO), and your Deployed Ontology (DO) consists of fragments from the RO chosen to satisfy practical / business / technology needs, you can construct your semantic mappings based on the RO and communicate via your DO's. There are some nuances that I can't get into in this not-so-short email, but it's relatively straightforward.

On the mapping-generation side, here's how it works. Assume you have a repository of ontologies organized intelligently, say a repository that pays attention to how logical structures evolve and are linked to one another. Then you can simply specify query mapping statements like

(forall (A B) (if (ancestor A B)
                     (leq A B)))

Meaning -- "Is the *ancestor* relation ranging over people the same as *less-than-or-equal-to*?" Etc.

Say two ontologies, O1 and O2, both have deployed notions of Ancestor. In O1, ancestor ranges only over humans, while in O2 it applies to any living organism (including single-celled organisms, bacteria etc.).

You could construct independent queries for each, linking to the repository. (I'll use the notation Rep_O_4 to refer to, say, module 4 in the ordering hierarchy in the repository.)

Now the creators of O1 and O2 don't need to know about the other's work. They simply map into the repository and figure out what models their axioms are committing them to. Then one day, they want to share data. Let's say O1 mapped to entry Rep_O_2, while O2 mapped to entry Rep_O_5 (both Ancestors are a type of ordering and the ontologists wanted to know what type).
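As a rough sketch of the "what type of ordering is it?" question, the snippet below tests a finite sample of ancestor facts against a few ordering modules. The module names, the axiom groupings and the sample data are all assumptions made for illustration; they are not the repository's actual entries. Checking sample data can only rule modules out, whereas a real mapping (e.g. a relative interpretation) is established by proof over the axioms themselves.

```python
# Ordering axioms as predicates over a domain and a binary relation.
def reflexive(dom, r):
    return all((x, x) in r for x in dom)

def irreflexive(dom, r):
    return all((x, x) not in r for x in dom)

def antisymmetric(dom, r):
    return all(x == y or (y, x) not in r for (x, y) in r)

def transitive(dom, r):
    return all((x, z) in r for (x, y) in r for (y2, z) in r if y == y2)

def total(dom, r):
    return all((x, y) in r or (y, x) in r for x in dom for y in dom)

# Hypothetical repository of ordering modules (names are illustrative only).
MODULES = {
    "partial_order": [reflexive, antisymmetric, transitive],
    "linear_order": [reflexive, antisymmetric, transitive, total],
    "strict_partial_order": [irreflexive, transitive],
}

def consistent_modules(dom, r):
    """Names of the modules whose axioms all hold in this finite sample."""
    return sorted(name for name, axioms in MODULES.items()
                  if all(axiom(dom, r) for axiom in axioms))

# Hypothetical sample of O1's ancestor facts.
people = {"ann", "bob", "cat", "dan"}
ancestor = {("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("ann", "dan")}

print(consistent_modules(people, ancestor))  # ['strict_partial_order']
```

In this sample nobody is their own ancestor, so the reflexive modules are ruled out immediately; that is the kind of commitment an ontologist learns by mapping into the repository.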

Assume also that Rep_O_2 is a non-conservative extension of Rep_O_5 (remember, Ancestor in O2 allows for organisms to be born from only one parent, so it allows more models). So Rep_O_2 permits only a subset of the models of Rep_O_5. You now have a mapping between the two target ontologies, and you know which theorems would be preserved and which would not be entailed. Hence you would modify the communication interface between the ontologies accordingly if you want to seamlessly exchange information; i.e. if O1 and O2 want to communicate, messages sent from O2 to O1 using ancestor can only be used when ranging over humans, with XYZ axioms removed or added or whatever...
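A minimal sketch of what such an interface adjustment might look like on the O2-to-O1 side, assuming messages are plain (relation, subject, object) triples and that O1 publishes the set of individuals it counts as human. The triple format, the individual names and the `to_o1` helper are all hypothetical.

```python
# Hypothetical set of individuals that O1 recognises as humans.
O1_HUMANS = {"ann", "bob"}

# Hypothetical outgoing facts from O2, whose ancestor ranges over any organism.
o2_facts = [
    ("ancestor", "ann", "bob"),
    ("ancestor", "cell_a", "cell_b"),
    ("ancestor", "cell_a", "ann"),
]

def to_o1(facts, humans):
    """Forward an ancestor fact only when both arguments are humans in O1's sense."""
    return [
        (rel, x, y)
        for (rel, x, y) in facts
        if rel != "ancestor" or (x in humans and y in humans)
    ]

print(to_o1(o2_facts, O1_HUMANS))  # [('ancestor', 'ann', 'bob')]
```

The filter only restricts the range of ancestor; which axioms to add or drop on top of that depends on the mapping between the two repository entries.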

Now I know that I went through this really quickly. Some clarifications on terminology: when I say O1 mapped into entry Rep_O_2, *mapped* means one of (but not limited to):
  • Relative Interpretation
  • Faithful Interpretation
  • Definable Equivalence
and all their variants. For a bit of detail on what each of those are right now, refer to slide 4 of Michael Gruninger's presentation here: http://ontolog.cim3.net/cgi-bin/wiki.pl?OOR/ConferenceCall_2010_02_19 ; otherwise there is that upcoming FOIS paper which provides all the details for what these mappings are.

Ok, I'll stop here. This email is long and I think I may have lost all but the most ardent readers. If you're still here, this is an adapted fragment of another paper we (myself and Michael Gruninger) have almost concluded (not the FOIS one). It is presaged in my master's thesis in the chapters on an architecture for an ontology repository and semantic mappings.

In a nutshell: write more axioms, and don't worry about deployment at first. Once you've captured as much of the semantics as you can in a computer-readable form, take fragments of your expressive ontology and deploy them in your favourite DL. When you try to communicate with others, use your Reference Ontologies and generate robust mappings so you know exactly which statements you can trust to exchange and what modifications you might have to tack onto incoming and outgoing messages. Discovering and defining these mappings is #significantly# easier in FOL, where more of the semantics are encoded! Not to mention, a procedure exists to pretty much discover these mappings automatically if you already know "Well, ancestor is some sort of order, I just don't know which one or how exactly."

Lastly, don't forget the role that "logical structures" play in interoperability. While we're talking about the real world, we're also talking about computers and the aspects of the real world we've represented on computers. That means there is a lot of inherent sharedness already provided to us, almost for free! :D Something about mediums being messages...


(•`'·.¸(`'·.¸(•)¸.·'´)¸.·'´•) .,.,

Attachment: Modules and Models.jpg
Description: JPEG image

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
