Re: [ontolog-forum] Ontology similarity and accurate communication

To:	"'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From:	"Patrick Cassidy" <pat@xxxxxxxxx>
Date:	Fri, 7 Mar 2008 09:43:48 -0500
Message-id:	<016201c88061$a76b28a0$f64179e0$@com>

Pat Hayes responded to a post on this thread.

OK, let’s try to clarify the points that PatH either misunderstood or just disagrees with.

[[1]] First, “definition”

[PC] > An issue that has occupied some of my attention lately has been the
> question of what basic ontological concepts are sufficient to support
> accurate communication. I frame the issue as being analogous to the
> "defining vocabulary" used by some dictionaries as a controlled vocabulary
> with which they define all of their words. For the Longman's, it is around
> 2000 words. The analogous question is how many fundamental ontological
> elements (types/classes, and relations/functions) are needed to logically
> specify the meanings of all other terms used in a reasonably complex domain
> (having perhaps 100,000+ terms), to some adequate level of detail?

[PH] > Define "define". If you mean logically define, then the number of defining terms is going to be comparable to the total number of terms. All 'natural kind'
> terms, for example, are definitionally primitive.

When I used “define” in reference to dictionary definitions, the meaning should be clear, and ample examples are available. For the “Conceptual Defining Vocabulary”, the foundation ontology nicknamed thus to emphasize its **analogy** to the linguistic defining vocabulary used in dictionaries, I carefully used the term “logically specify the meaning” because I am aware that using the term “define” in an ontology context sets off fire alarms among those who can only interpret that word as meaning “necessary and sufficient definition”. I suppose that calling it a “Conceptual Defining Vocabulary” may cause a reaction in those who are sensitive to the specialized meaning of “define” in logic, but I thought that the distinction was clear. Nevertheless, I will continue to use “Conceptual Defining Vocabulary” because I think it is a useful analogy, even though the “meanings” of the terms in the ontologies created using the foundation ontology will rarely be specified as necessary and sufficient logical “definitions”.

To “specify the meaning” of a subtype is to assert the necessary conditions for membership, for example:

(1) Asserting it as a subtype

(2) Asserting some property or relation not held by other subtypes – this could be as simple as being a subtype of another type.

(3) If there are necessary properties or relations not inherited from any of its parent types, they must also be asserted, using terms already in the conceptual defining vocabulary (aka foundation ontology)

(4) If a subtype is precisely the intersection of two parent types, that would be a necessary and sufficient definition. Of course, the parent types may be primitives.

(5) Occasionally, other means of specifying necessary and sufficient conditions will be used.
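
The distinction in the list above, between asserting only necessary conditions and giving a necessary and sufficient definition, can be illustrated with a toy sketch. Everything in it (the `TypeSpec` class, the type names, the property strings) is hypothetical and invented for illustration; it is not taken from any actual foundation ontology.

```python
from dataclasses import dataclass

@dataclass
class TypeSpec:
    """A type in a toy ontology (all names here are illustrative assumptions)."""
    name: str
    parents: tuple = ()                      # (1) asserted supertypes
    necessary: frozenset = frozenset()       # (2)/(3) necessary properties asserted locally
    defined_as_intersection: bool = False    # (4) True if the parents are also sufficient

def necessary_conditions(spec, registry):
    """Collect the locally asserted plus inherited necessary properties."""
    props = set(spec.necessary)
    for parent in spec.parents:
        props |= necessary_conditions(registry[parent], registry)
    return props

def can_infer_membership(known_types, spec):
    """Membership can be inferred only when the type is defined as an
    intersection and the individual is known to belong to every parent."""
    return spec.defined_as_intersection and set(spec.parents) <= set(known_types)

registry = {
    "Animal": TypeSpec("Animal", necessary=frozenset({"alive"})),
    "Pet": TypeSpec("Pet", necessary=frozenset({"has_owner"})),
    # Only necessary conditions: being a striped animal does not make something a tiger.
    "Tiger": TypeSpec("Tiger", parents=("Animal",), necessary=frozenset({"has_stripes"})),
    # Necessary and sufficient: exactly the intersection of its two parents.
    "PetAnimal": TypeSpec("PetAnimal", parents=("Animal", "Pet"),
                          defined_as_intersection=True),
}

tiger_conditions = necessary_conditions(registry["Tiger"], registry)
pet_animal_inferred = can_infer_membership({"Animal", "Pet"}, registry["PetAnimal"])
tiger_inferred = can_infer_membership({"Animal"}, registry["Tiger"])
```

In this sketch `Tiger` inherits and adds necessary conditions but membership in it can never be inferred from them, whereas membership in `PetAnimal` follows from membership in both parents.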

There are several criteria which I use to decide whether a newly created concept representation is or is not primitive. I have mentioned those before, and won’t reiterate here.

[[2]] how much vocabulary do we have in common?

[PC] >> My own suspicion is that the similarity **in the fundamental concepts**
>> has to be very close to 100%.

[PH]
> There is quite convincing evidence that this is NOT the case. In particular, human beings seem to
> communicate adequately even while the terms they use in the communication are based on
> different mental ontologies. This happens so often that it seems to be the norm rather than
> the exception; it is part of what makes it so hard for reasonable people to come to agreement
> on basic ontological questions (such as whether time is a dimension or not).

This is precisely the point that I dispute, that the communication is based on different ontologies. This is worth some discussion. The hypothesis I presented is that the part that **is understood** is based on *exactly* the same ontology, and where the ontologies differ there is misunderstanding. In fact, I can’t imagine how it can logically be otherwise. Different ontologies mean different interpretations (at least when the parts that are different are part of the interpretation process). But, **please** note that I said that what is similar in people are the **fundamental concepts**, among which are the naïve physics sort of notions that a weighty object will fall if not supported, when near the earth’s surface. The point is that there are lots of these, including how people interact with each other, that we learn from experience as we grow up. And when we put labels on those common experiences, we have the most basic ontology, which is extended by analogy to less concrete things. The exact process by which the analogical extensions are created is a fascinating subject of research, but for the moment I am only concerned with the end result – we all have a large (> 1000) body of common concepts for which we have terms that can be understood precisely, when uttered in a familiar context (don’t get started on that – conversational and situational context, OK?). So it appears that you disagree. But on what basis?

Funny thing you should mention time and “dimension” because that is precisely one of the good illustrations of this point. When I did an exercise, creating linguistic (dictionary-style) definitions of 500 words not in the Longman’s defining vocabulary, the one word I decided that I needed that was not itself in the original vocabulary was “dimension”. They didn’t need it to define (the way they do their definitions) any of the 65,000 words in the dictionary. Also to the point, 5-year-olds who know thousands of words and a lot of naïve physics and naïve sociology will typically not have a clue as to what a “dimension” is. That is a word we learn in school. At least in my case I am sure I didn’t have a need for it until I learned some geometry. Ergo, the reason that your disputants couldn’t agree on whether time is a “dimension” is that “dimension” is not one of the fundamental concepts that are lexicalized in the basic vocabulary. It’s not part of the linguistic defining vocabulary of words we all agree on. It’s needed for math, and in that context, describing visualizable two- or three-dimensional spaces, is probably uncontroversial. When used by analogy to refer to other concepts, I am not surprised that terminology disputes can arise.

Of course, words differ in their meaning depending on context, and that will screw up any but the most carefully planned experiments. People who have no clue as to what an “ontology” is communicate quite well; in discussing accuracy of linguistic communication, we are talking about the implied ontologies people use in communicating, not the logic-based systems that computational ontologists create. So with respect to your fascinating comment:

>> the terms they use in the communication are based on different mental ontologies.

Just which terms are based on “different ontologies”?? How did the researchers extract these “ontologies”? How did they distinguish different *ontologies* from different interpretations of *word sense*? This is interesting. References??

If we can find the terms that differ, that could provide us with clues as to what words are not in the common basic vocabulary.

[[3]] Relative speed of brain and computer

[PC] >> The reasoning is something like this: if the
>> brain (or a simulation) does as much computation as one of our laptops, then
>> it can run at least 1 million inferences per second.

[PH] > There is no evidence whatever that the brain is capable of this. In fact, there is near-conclusive evidence
> that the human brain is incapable of quite simple logical inferences without external
> support (such as drawing a diagram or writing a formula). In particular, people - even
> with logical training - consistently make systematic errors when given
> simple modus tollens reasoning tasks, with confidences in the 90% range.

First, note that I said ***if*** the brain does as much computation. It’s true, I think it does more, but of course not in a linear binary manner. Second, and more importantly, I did *not* say that the brain does **conscious and sequential** modus ponens inferencing of the type we do with our logic formalisms. It does its inferencing by massively parallel signal processing. With >10**11 neurons, each with > 1000 connections to other neurons, firing multiple times per second, it hardly seems like a bold assertion to imagine that it will accomplish at least the FOL equivalent of 1 million inferences per second. The visual system processes images multiple times per second to a level of detail that our fastest computers cannot yet mimic. When a baseball batter sees a ball coming at him for a few tenths of a second and starts swinging, just how many inference-equivalents do you think it would take to process the image, calculate the proper swing, and send the signals to the proper muscles? Fewer than 300,000? Then you should be able to write a program to do that really easily, at 2 gigahertz.
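
The back-of-envelope arithmetic in the paragraph above can be written out as a short sketch. The neuron and connection counts and the 1 million inferences per second are the figures given in the text; the firing rate of 2 per second and the 0.3-second swing window are assumptions chosen here only to make "multiple times per second" and "a few tenths of a second" concrete.

```python
# Figures taken from the text
NEURONS = 1e11                 # ">10**11 neurons"
CONNECTIONS_PER_NEURON = 1e3   # "> 1000 connections to other neurons"
INFERENCES_PER_SECOND = 1e6    # claimed FOL-equivalent inference rate
CLOCK_HZ = 2e9                 # "2 gigahertz"

# ASSUMPTIONS made only for this illustration
FIRINGS_PER_SECOND = 2         # "multiple times per second" read as 2
SWING_WINDOW_SECONDS = 0.3     # "a few tenths of a second" read as 0.3

# Raw signalling events available per second under these figures
synaptic_events_per_second = NEURONS * CONNECTIONS_PER_NEURON * FIRINGS_PER_SECOND

# Inference budget for the batter at the claimed rate
inference_budget = INFERENCES_PER_SECOND * SWING_WINDOW_SECONDS

# Clock cycles a 2 GHz processor has in the same window
cycles_in_window = CLOCK_HZ * SWING_WINDOW_SECONDS

# How many signalling events would stand behind each inference-equivalent
events_per_inference = synaptic_events_per_second / INFERENCES_PER_SECOND

print(f"synaptic events/s:    {synaptic_events_per_second:.1e}")
print(f"inference budget:     {inference_budget:,.0f}")
print(f"cycles in window:     {cycles_in_window:.1e}")
print(f"events per inference: {events_per_inference:.1e}")
```

Under these assumed values the 300,000 figure is simply 1 million inferences per second times 0.3 seconds; changing either assumption scales the result linearly.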

[[4]] How close are the ontologies of two different people

[PC] >> A similarity of 99.9% in two different fundamental
>> ontologies may not be enough for any meaningful level of communication.

[PH] > Look, this HAS to be nonsensical. If this really were true, how could human
> beings EVER succeed in communicating?

Because our basic ontologies are in fact closer than that.
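
For reference, the crude calculation behind the 99.9% figure (from the earlier post quoted at the end of this message) can be sketched numerically. The inference rate, the 1-in-1000 difference, and the 1000 assumptions per fact are the numbers from that post; treating the assumptions as independent is an added simplification made only for this sketch.

```python
INFERENCES_PER_SECOND = 1_000_000   # rate assumed in the earlier post
DIFFERING_FRACTION = 1 / 1000       # ontologies differ in 1 inference out of 1000

# Differing inferences generated per second from the same information
differing_per_second = INFERENCES_PER_SECOND * DIFFERING_FRACTION

def p_all_shared(agreement, n_assumptions):
    """Probability that every one of n assumptions behind a fact is shared.

    ASSUMPTION: each assumption is shared independently with the same
    probability; the original post does not state this.
    """
    return agreement ** n_assumptions

# A "fact" supported by 1000 assumptions, at 99.9% similarity
p_at_999 = p_all_shared(0.999, 1000)

# The same fact when similarity is much closer to 100%
p_at_99999 = p_all_shared(0.99999, 1000)

print(f"differing inferences per second: {differing_per_second:.0f}")
print(f"P(all shared) at 99.9%:   {p_at_999:.3f}")
print(f"P(all shared) at 99.999%: {p_at_99999:.3f}")
```

Under the independence simplification, 99.9% similarity leaves well under half of such facts fully shared, while similarity much closer to 100% leaves nearly all of them shared.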

[PH] > There is no way you and I could come to a logically secure agreement
> on 1000 axioms even if we had months to do it in and had nothing else to do.
> But children learn natural language at a rate of around 20 new words per day around
> the ages of 4-6 years old, pretty much all by themselves.

Talk about nonsense. If a child learned language by itself, why don’t the feral children speak the native language of their area fluently? Yes, children learn first largely by pointing to instances, but when a young child visits a zoo and points to a tiger and says “pussy cat” we know that her internal definition of Felidae has not yet reached the level of detail of an adult. The internal meanings of words do get refined as people have more experience of the world and learn more about the language.

First, in talking about the shared common ontology expressed in language, I am discussing the kind of basic notions people use in general communication, and in particular when trying to explain things clearly to one another. These words are associated with their meanings by experience in context, not by discussions among opinionated academics. The intended meanings of the words people use when they are trying to be clear are almost always understood properly – because the words are chosen to minimize the potential for misinterpretation – the speaker knows how the listener will interpret those words. The issue remains, when I ask you to explain the meaning of something, how could I possibly understand the answer unless you used words whose meanings we both agree on? Yet that happens every day many times a day to almost everyone. By age 5 or 6, children are capable of learning by being told. They have a good part of the basic defining vocabulary by that point. I suspect that the basic vocabulary isn’t near its adult level until the teens.

Can you and I agree on an ontology? Well, you are, let us say, contentious, but I still believe that we could come to agreement on a lot more than 1000 axioms, provided that both of us had the time to discuss it and neither of us insists on throwing out something that the other proposes on any grounds other than a logical inconsistency with other parts of the ontology, or a fatal vagueness that makes it impossible to relate one to the other. The biggest reasons people disagree on ontologies (when they take the time to actually argue the points, and ignoring the most common source of disagreement, different use of terms) are either insistence on only one means of expressing a concept, even though logically consistent alternatives may be preferred by others, or simple personal preference. Many differences revolve around different theories of certain physical objects or processes. Those can be represented with logical consistency by categorizing them as alternative theories (Newton, Einstein), each useful in certain situations, neither necessarily expressing the fundamental structure of (whatever). You yourself have expressed the opinion that the differences among existing ontologies are smaller than they are generally considered to be, and logical consistency can be demonstrated by bridging axioms among different ways of expressing the same idea. If we allow logically consistent ways to express the same concept, or include slight variations on a concept, agreement will be much easier.

In the SUMO project, one of the problems was that Adam and Ian took a minimalist view of the upper ontology. They wanted to include only one way of representing things in the world. In part, they had to produce a deliverable in a specified time frame, which forced them to avoid extended discussion. In addition, their concern, as I interpreted it, was that any redundancy would eventually result in an intractable complexity of reasoning. But that doesn’t mean that what they had in the ontology was in any sense *wrong*. In some cases, it just didn’t satisfy the *preferences* of others. If they had adopted the tactic of putting in whatever anyone wanted, properly related to the other parts, it might have had a wider acceptance. The lack of time and funding for wider participation was, I believe, also a major problem.

Do an experiment. Go to the Longman dictionary site:

http://www.ldoceonline.com/

. . . and look at some of the definitions that they give for words you select. (Don’t worry too much about whether the definitions actually give a good mental picture of the meaning of the word in all its nuance and detail - they are designed to be simple, for learners of English as a second language). The question is, do you think that there are ambiguities in the defining words that make you uncertain as to what they intend by those definitions? If you find any case where you think that you don’t properly understand the meanings of the words they use, let us know which ones.
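
A mechanical first pass at this experiment is to list which words in a definition fall outside a given defining vocabulary. The sketch below does that. The tiny vocabulary and the sample definition are invented for illustration (they are not Longman's actual word list or wording), and the check ignores inflections and word senses, which is exactly where any real ambiguity would lie.

```python
import re

def words_outside_vocabulary(definition, defining_vocabulary):
    """Return the sorted set of words in `definition` that are not in
    `defining_vocabulary`. No stemming or sense disambiguation is done."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", definition.lower())
    return sorted({t for t in tokens if t not in defining_vocabulary})

# HYPOTHETICAL defining vocabulary and definition, for illustration only
toy_vocabulary = {
    "a", "large", "wild", "animal", "of", "the", "cat", "family",
    "with", "yellow", "and", "black", "lines", "on", "its", "body",
}
toy_definition = ("A large wild animal of the cat family, "
                  "with yellow and black stripes on its body")

outside = words_outside_vocabulary(toy_definition, toy_vocabulary)
print(outside)
```

A word flagged this way is only a candidate: it may be an inflected form of a vocabulary word, or it may point to a concept that the defining vocabulary genuinely lacks.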

[[5]] how do native speakers learn to agree?

[PC] >> We all know that people differ in assumptions and beliefs, and yet we do
manage to communicate reasonably well in most cases. How can that be?

[PH] > Because the assumptions and beliefs themselves are irrelevant to the communication:

> all that matters is that we come to agreement on the beliefs we EXPRESS to one another.

(PC – I thought that’s what the next part of the paragraph was saying)

[PC] >> Well, it happens probably because we **know** that we have different
assumptions and beliefs, and when communicating, only assume that there is a
certain fundamental set of knowledge in common, and only rely on that basic
set of common assumptions and beliefs to express the ideas we want to
communicate.

[PH] No, that doesn't work. First, people don't know this.

[PC] Oh, come on. We all learn at an early age that people believe different religions, and even different politics. Everyone knows that many other people are not experts in the field we are experts in. We encounter differences of opinion as soon as we can talk. When a specialist tries to explain something to a non-specialist, s/he doesn’t usually use technical jargon (unless s/he is oblivious to the fact that it is jargon). I don’t understand this objection. How would you succinctly describe the Gricean conversational maxims?

[PH] Second, how is agreement on this vital common 'core' achieved?

[PC] By growing up in an environment where the basic words are used consistently to refer to the same common experiences that every person has, regardless of where we grow up. We don’t get together around a table to agree on the basic words. We absorb them by living in an environment where they are used consistently in the same sense, in a particular situation. We need to see them used in a common linguistic environment to begin to grasp the meaning.

===========================================

To understand the points above, recall that I am talking about the basic vocabulary shared by most native speakers of a language, not the technical words that we learn on special topics – even something so fundamental-seeming as “dimension” or “identity” (also not in LDOCE defining vocabulary). There are plenty of those common words, and plenty more of the specialized ones that can be defined (dictionary definition) using the common ones, though in some cases a picture makes understanding a lot easier.

When building a foundation ontology to serve as a Conceptual Defining Vocabulary, we will of course create abstractions that are not lexicalized in the linguistic defining vocabulary, in order to analyze the more common concepts into their more fundamental component parts. This presents an opportunity/risk for different ways of analyzing the structure of common basic concepts – like “event”. But when the most fundamental components of meaning are extracted and represented, the different ways that ontologists may prefer to represent the same thing can all be represented with the most basic concepts, and the relation between those different ways of viewing the same thing can be precisely specified. I am aware that merely saying this will not convince anyone. It requires a project to actually test the hypothesis. That was also a subject of a previous posting. I wish that the IKRIS project had continued – I think it would have produced some important data related to this hypothesis.

PatC

Patrick Cassidy

MICRA, Inc.

908-561-3416

cell: 908-565-4053

cassidy@xxxxxxxxx

From: Pat Hayes [mailto:phayes@xxxxxxx]
Sent: Thursday, March 06, 2008 4:15 PM
To: [ontolog-forum]
Cc: edbark@xxxxxxxx; Patrick Cassidy
Subject: Re: [ontolog-forum] Ontology similarity and accurate communication

At 8:00 AM -0500 3/6/08, Patrick Cassidy wrote:

In the discussion on "orthogonal", Ed Barkmeyer pointed out:

> My position is that two agents don't need to have non-overlapping
> ontologies to be unable to communicate effectively. Their ontologies
> can have a 90% overlap, but if there is one critical idea that one has
> and the other does not understand, they can't do business.
>

Ed focused on the problem that arises when one 'critical idea' differs
between the ontologies (or assumptions) of two different communicating
agents. I suspect that the problem can also arise when even minor
differences are present in the interpretations of communicated information,
because the interpretation of many concepts involve a very large number of
implications and associated inferences.

This question appears to me to be one that is worthy of a separate field
of investigation: precisely how different can ontologies be while sustaining
an adequate level of accuracy in interpreting communications that rely on
the ontologies?

My own suspicion is that the similarity **in the fundamental concepts**
has to be very close to 100%.

There is quite convincing evidence that this is NOT the case. In particular, human beings seem to communicate adequately even while the terms they use in the communication are based on different mental ontologies. This happens so often that it seems to be the norm rather than the exception; it is part of what makes it so hard for reasonable people to come to agreement on basic ontological questions (such as whether time is a dimension or not).

The reasoning is something like this: if the
brain (or a simulation) does as much computation as one of our laptops, then
it can run at least 1 million inferences per second.

There is no evidence whatever that the brain is capable of this. In fact, there is near-conclusive evidence that the human brain is incapable of quite simple logical inferences without external support (such as drawing a diagram or writing a formula). In particular, people - even with logical training - consistently make systematic errors when given simple modus tollens reasoning tasks, with confidences in the 90% range.

If (crudely
calculating) the inferences supported by the differing ontologies differ by
1 in 1000 then two different ontologies will generate 1000 differing
inferences per second from the same information. How much difference can be
tolerated before something goes badly wrong - perhaps a direct logical
contradiction? My guess is that each serious "fact" that we rely on to
support our everyday activities is supported by at least 1000 assumptions,
and getting one in a thousand wrong would invalidate the meaning of these
facts, making normal actions, expecting predictable results, effectively
impossible at any level. A similarity of 99.9% in two different fundamental
ontologies may not be enough for any meaningful level of communication.

Look, this HAS to be nonsensical. If this really were true, how could human beings EVER succeed in communicating? There is no way you and I could come to a logically secure agreement on 1000 axioms even if we had months to do it in and had nothing else to do. But children learn natural language at a rate of around 20 new words per day around the ages of 4-6 years old, pretty much all by themselves.

But, as I said at the start, this is an issue that needs investigation.

We all know that people differ in assumptions and beliefs, and yet we do
manage to communicate reasonably well in most cases. How can that be?

Because the assumptions and beliefs themselves are irrelevant to the communication: all that matters is that we come to agreement on the beliefs we EXPRESS to one another.

Well, it happens probably because we **know** that we have different
assumptions and beliefs, and when communicating, only assume that there is a
certain fundamental set of knowledge in common, and only rely on that basic
set of common assumptions and beliefs to express the ideas we want to
communicate.

No, that doesn't work. First, people don't know this. Second, how is agreement on this vital common 'core' achieved?

If we 'misunderestimate' what our fellow conversant knows,
there can be and often is a miscommunication. The ability to communicate
effectively depends on the ability to guess correctly what facts,
assumptions, and beliefs are likely to be shared by those with whom we
communicate. Among specialists, of course, a lot more common technical
knowledge is assumed.

An issue that has occupied some of my attention lately has been the
question of what basic ontological concepts are sufficient to support
accurate communication. I frame the issue as being analogous to the
"defining vocabulary" used by some dictionaries as a controlled vocabulary
with which they define all of their words. For the Longman's, it is around
2000 words. The analogous question is how many fundamental ontological
elements (types/classes, and relations/functions) are needed to logically
specify the meanings of all other terms used in a reasonably complex domain
(having perhaps 100,000+ terms), to some adequate level of detail?

Define "define". If you mean logically define, then the number of defining terms is going to be comparable to the total number of terms. All 'natural kind' terms, for example, are definitionally primitive.

Most ontology languages don't even support definitions. Why are you so focused on definitions which usually don't, and often can't, exist? And which if they did exist would be largely functionally irrelevant in any case?

I don't
know, but I think that this is a question that is important enough to
warrant substantial effort. My guess is in the 6,000-10,000 concept range,
and that many of those are fundamental enough to be common to many complex
domains.

Any other guesses?

See above. My guess is that most terms we use, both to communicate with and to speak with, have no definitions and do not need them. We eliminated definitions from KIF because they caused only harm and provided no functionality.

Pat Hayes


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx

--

---------------------------------------------------------------------
IHMC               (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.       (850)202 4416   office
Pensacola                 (850)202 4440   fax
FL 32502                     (850)291 0667    cell
http://www.ihmc.us/users/phayes      phayesAT-SIGNihmc.us
http://www.flickr.com/pathayes/collections

