Pat Hayes raised several issues I think are worth further discussion,
but I would like to focus on two of them:
(1) Hasn't building a common foundation ontology already been tried?
and
(2) What would be in a common foundation ontology?
One point PH made I can't dispute: the funding agencies have a very
narrow focus, at least with respect to ontologies. To give them more
confidence in the field and help them broaden their focus it would help
if we were able to achieve some agreement on some hunks of content, not
just logical formalism.
Concerning the need for a common foundation ontology:
[PC]
> >
> >Perhaps that will not
> >suffice, but it is a strategy that has not been tried and should be.
[PH]
>
> It has been tried, I would say. CYC and IEEE-SUMO are both attempts
> at a super-overarching ontology.
These are in fact perfect illustrations of (what we now can see is) the
wrong approach to get *wide agreement* (regardless of technical merit),
and are not examples of what I would consider a properly funded and
coordinated project. In both cases, the ontologies were built by a
small group (Ian and Teknowledge tried to get input, but got only a
little and used only part in the hurry to meet their deadline), and
were presented to the world as a fait accompli, without any example
applications (real ones, not facilities to query the ontology). The
difference is that to get a large enough community to actually try out
and use a common foundation ontology, a significant sample of the
potential users have to have a hand in building it, so they can also
make sure that everything they need is in there, and build the
demonstration applications that will convince others that there is
something in there worth the considerable effort of learning how to use
it. Cyc was also hobbled initially by being a closed commercial
project.
As I recall from the ANSI X3/T2 meetings, most of us knew by 1997
that no standard was likely to be widely adopted until a substantial
group was funded to build it and test it. Building the utilities and
applications would take up most of the cost of the project, but the
project has to be developed with the understanding that agreement on
some common foundation ontology is the goal, and that the applications
are, at least initially, in support of that. The volunteer efforts
you cite do indeed demonstrate that volunteer effort will not work for
this task. Building the foundation ontology, along with coordinated
utilities and some applications, is an engineering task that needs
central direction, goals, and timetables, and cannot wait for consensus
from people who have little time to devote to the project (and personal
axes to grind). Differences have to be resolved rapidly, by vote or
some other procedure that is agreed to ahead of time by those who are
willing to participate.
There is plenty that can be discussed on this topic, but it is
important not to perpetuate the misperception that a properly organized
engineering project of a wide sampling of the ontology community ever
was funded in a way that would provide them with adequate time (and a
properly organized process) for serious input. The largest projects
thus far, as best I know, were funded in the range of about 1.5 million,
which was enough only for several person-years full-time equivalent,
spread over all the participants. Useful things can be done with such
funding, as the SUMO and IKRIS projects demonstrate. It just isn't
enough to get substantial participation from the wide sampling of
community that will (1) create an expectation that the product will in
fact be widely used; and (2) fund initial efforts to actually use the
product in applications, to demonstrate how it can be used. In the
case of IKL, for example, there should have been further funding of an
implementation that has a reasonably easy interface. The number of
participants (> 20% time) now would probably have to be well over 50,
and of course one cannot expect consensus on all aspects of the
project. Engineering projects need not be (and I imagine never are)
organized with a requirement that consensus of a large number of
participants is needed. The product has to be usable enough, and
demonstrably usable, so that people will want to use it, even if it
doesn't have everything they want in it. The key is to motivate people
to want the project to succeed, provide them with the resources, and
create a demonstrably usable product. Paying them will help, and
creating the expectation that it will succeed will help, and will in
turn be helped by a properly organized project.
Therefore, I would reiterate: there never has been a project to build a
foundation ontology that involved a diverse representative group of
ontology builders and users, that was organized specifically to produce
a common foundation ontology with demo applications, and was adequately
funded to have a good chance to succeed. Yet the potential benefits of
getting such an ontology are vastly larger than the costs.
Since we started this discussion two days ago, over 500 million dollars
have been lost nationwide in inefficiency due to lack of semantic
interoperability. I have a sense of urgency about the practical need
for a solution soon.
(2) What would be in the foundation ontology?
[PC]
> >- and the 'foundation ontology' is only the set of ontology
> >elements (types, relations, axioms) required to permit the needed
> >specification of the meanings of the domain elements that people
> >will wish to formalize
>
[PH]
> But what could possibly be in such an ontology? Consider a Cyc-like
> effort, aimed at 'common sense'. Now suppose your domain of interest
> is navigating a robot submarine through deep ocean water. You will
> need many purely spatial concepts, but I will lay very good odds that
> no more than one or two of those in the Cyc-style Kbase will be the
> slightest use. You will have to re-think what it means to 'rotate',
> for example, or what 'outside' means. Most of the everyday spatial
> concepts will simply be useless when swimming in a deep ocean. (This
> example actually came up when I was working on the Cyc project, and
> we checked.) Or suppose your domain is describing how viruses get
> inside human cells by tricking their membrane chemistry: will the
> notion of 'boundary' you are using really be similar to the notion in
> the foundation? If the latter is based on mathematical ideas it won't
> be, because the membrane, at this scale, has an elaborate 3-d
> structure. How about 'solid'? Sorry, at this scale the membrane is
> solid-ish perpendicular to its immediate tangent plane, but more like
> a liquid in that plane, so its vesicles can float freely around the
> membrane. It's an odd kind of thing that doesn't exist at the everyday
> common-sense scale at all. Will that be in your foundation ontology?
> Everywhere you look at real domains, one finds new kinds of 'thing'
> that you wouldn't have thought of unless you were working in that
> domain. (The Horatio principle: there are more things in heaven and
> earth than are dreamt of in your ontology.) This is basically why
> these universal ontologies never succeed: nobody has a rich enough
> imagination to think of all the things that might be needed, which is
> another way of saying all the counterexamples to any philosophically
> motivated distinction.
>
[PC]
Yes, there are tens of millions of categories of things and attributes
and relations that people will want to talk about in various contexts,
and most of those will not yet be in Cyc, but the question is whether
there is a much smaller set of basic concepts whose meanings are quite
well defined, and which serve to provide the basis for combinatorial
description of all the more complex concepts. If we agreed on that
basic set of concepts (and called it the 'foundation ontology' or any
other term) that would allow all of the wonderful diversity of
ontology-building that we all agree can lead to great things, and it
would also provide us all with a good mechanism to relate these
ontologies to each other, by specifying the meanings of their terms and
concepts (to whatever degree of detail is needed), using the same set
of basic terms from the common foundation ontology. Any required
fundamental concepts found to be missing from the foundation ontology
will be added when they are discovered.
There is already an existence proof of just that mechanism, in the
"controlled defining vocabularies" used by some dictionaries to define
their terms; with about 2000 defining terms, Longman's can provide
meaningful definitions of over 100,000 dictionary terms. Since those
words are labels for concepts, it makes sense to imagine that a
comparably small "conceptual defining vocabulary" could be used to
logically specify the intended meanings of the hundreds of thousands of
complex concepts that people will want to work with in a computer. But
we don't yet know the necessary size of the corresponding conceptual
defining vocabulary (doubtless larger, since many of the defining words
had multiple senses), and I am not aware that Cyc or any other
ontology-builder has intentionally set out to create that conceptual
defining vocabulary for that specific purpose. I saw one paper by
Porter who looked into a similar tactic on a small scale, but does not
seem to have followed up. Perhaps someone here is aware of other
efforts specifically in that direction?
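
To make the defining-vocabulary mechanism concrete, here is a minimal
sketch in Python. It is only an illustration under stated assumptions:
the function name, the base terms, and the toy definitions are all
invented for the example and are not taken from Longman's, Cyc, or
SUMO. It traces a term through its definitions down to the base set,
and reports any term that is neither in the base set nor defined.

```python
from typing import Dict, Set, Tuple


def expand_to_base(term: str,
                   definitions: Dict[str, Set[str]],
                   base: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Trace `term` down to the base vocabulary.

    Returns (base terms reached, terms that are neither base nor defined).
    """
    used: Set[str] = set()
    missing: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:        # guards against circular definitions
            continue
        seen.add(current)
        if current in base:
            used.add(current)
        elif current in definitions:
            stack.extend(definitions[current])
        else:
            missing.add(current)   # candidate addition to the base set
    return used, missing


# Hypothetical toy data, for illustration only.
base = {"young", "female", "human", "animal"}
definitions = {
    "girl": {"young", "female", "human"},
    "cat": {"animal", "feline"},
    "kitten": {"young", "cat"},
}

girl_used, girl_missing = expand_to_base("girl", definitions, base)
kitten_used, kitten_missing = expand_to_base("kitten", definitions, base)
print(girl_used, girl_missing)
print(kitten_used, kitten_missing)
```

In this toy case "girl" is fully covered by the base set, while
"kitten" exposes one term ("feline") that would have to be either
defined or added to the base. A real conceptual defining vocabulary
would of course use logical specifications rather than bags of words.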
A critical point here is that it is imprudent to just assume that
something can't be done when the benefits of doing it can be great. It
should be given a proper chance to succeed. The mostly volunteer
processes used up to now have not worked, but there is still a
plausible method not yet tried.
I imagine that the minimum number of basic concepts needed will be at
least 5000 (types + relations), possibly 10,000? I am starting to
explore the question by creating a merger of some of the basic elements
of Cyc and SUMO (alas, first in OWL, for a couple of practical reasons,
but soon, necessarily, either translated into FOL or supplemented with
rules in some form). The starting point will be about 3500 classes and 300
relations, and the first question will be how much that will have to
expand to be able to specify the meanings of the 2000 words used as the
defining vocabulary in Longman's. Beyond that, there are some
additional standard vocabularies, and a test could inquire how well
those meanings can be specified. That should provide some useful data;
if an asymptote on new basic terms is approached, this will suggest the
practical feasibility of trying to get agreement on using some basic
set of defining concepts. This does not in any way restrict people
from doing anything they want any way they want to - it merely provides
a means for them to relate what they are doing to what others are
doing. The goal is similar to what I interpret as your goal with
wide-open module usage on the semantic web, but by having one firm
grounding of agreed meaning in a coherent and logically consistent unit
(which can incorporate any modules that find a constituency), I think
it will make that goal a lot easier to attain. Logically compatible
but different-looking representations of the kind you found
translations for in the IKRIS project can be incorporated into such a
structure.
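
The asymptote test described above could be tracked with something as
simple as the following sketch (Python). Everything in it is assumed
for illustration: the function name, the batch size, and the toy
concept names are hypothetical and do not come from the actual merged
ontology. It counts how many new basic concepts each successive batch
of target words forces into the foundation.

```python
from typing import List, Set, Tuple


def new_concepts_per_batch(requirements: List[Set[str]],
                           foundation: Set[str],
                           batch_size: int) -> Tuple[List[int], Set[str]]:
    """Count new basic concepts needed by each successive batch.

    `requirements[i]` is the set of basic concepts needed to specify
    the meaning of the i-th target word, in processing order.
    Returns (new concepts added per batch, final foundation set).
    """
    known = set(foundation)          # copy, so the caller's set is untouched
    counts: List[int] = []
    for start in range(0, len(requirements), batch_size):
        before = len(known)
        for needed in requirements[start:start + batch_size]:
            known |= needed          # add whatever was missing
        counts.append(len(known) - before)
    return counts, known


# Hypothetical toy data, for illustration only.
foundation = {"Object", "Event", "part_of"}
requirements = [
    {"Object", "Liquid"},
    {"Event", "Agent"},
    {"Liquid", "part_of"},
    {"Agent", "Object"},
    {"Event"},
    {"Object", "Liquid"},
]

counts, final = new_concepts_per_batch(requirements, foundation, batch_size=2)
print(counts)
print(sorted(final))
```

If the per-batch counts fall toward zero as more target words are
processed, that is the kind of evidence for an asymptote I have in
mind; if they stay flat, the basic set is still far from complete.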
Those who worked on the Cyc project probably will have some useful
experience to share, whether or not the Cyc baseKB was intended to be
viewed as a "defining vocabulary". In any case I hope that members of
this community will find time to look through the nascent "conceptual
defining vocabulary" (I will suggest it as the starting COSMO ontology
to the COSMO working group) to point out problems or missing elements.
It would be a lot easier with some funding, but first I need to post
the starting ontology and provide some data on how well it serves as a
base. Being OWL, at this point it is little more than a taxonomy. I
hope to have the first OWL version online at the ONTACWG site by the
time of the Ontology Summit.
[PH]
>
> The way to stop the debates is not to legislate one side as the
> winner (that just changes debate into open warfare) but to allow
> everyone to write their ontologies in the way they find congenial
> (informed by a basic knowledge of good engineering practice, of
> course), and to achieve inter-operation by re-use and translation.
Right. I just think that a module the size of the conceptual defining
vocabulary - which could have separable submodules - will ultimately
prove more amenable to rapidly approaching the kind of reuse that will
work - modules that are logically consistent with each other; and if
they aren't, a way to show that they aren't, so proper precautions can
be taken.
I don't expect that a distributed-base volunteer project will reach a
usable set of modules anywhere near as fast as a funded project would.
Of course, if there is no significant community of ontologists pushing
for such a project, it will probably never get funded.
Pat
Patrick Cassidy
MITRE
260 Industrial Way West
Eatontown NJ 07724
Eatontown: 732-578-6340
Cell: 908-565-4053
pcassidy@xxxxxxxxx