
Re: [ontolog-forum] Foundation ontology, CYC, and Mapping

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Wed, 17 Feb 2010 11:41:41 -0500
Message-id: <4B7C1C45.3080008@xxxxxxxxxxx>
Dear Matthew, Doug, and Pat C,    (01)

Before getting to the details of your notes, I'd like to clarify the
term FO, which Pat introduced for a Foundation Ontology.  I have no
objection to using the term FO, but I would apply it to the totality
of the theories (or microtheories) in an OOR.  Cyc, with its upper
ontology and many specialized microtheories, would be one example.    (02)

However, I would broaden the idea to support multiple theories at the
upper levels, which might be incompatible.  For example, Matthew,
Chris P, and Pat H have strongly supported a 4D ontology for the upper
levels, but many other people prefer to use a 3D upper level.  For
many of the lower level microtheories, the differences between a 3D
vs 4D foundation are irrelevant.    (03)

Pat C wants to use primitives based on Longman's dictionaries, but
I objected that those terms are too vague, and an enormous amount
of work would be needed to make them precise.  Therefore, I propose
the following approach:    (04)

  1. OpenCyc is freely available as open source, but Pat objected
     that it has very few axioms.  However, Longman's terms have zero
     axioms, and they have never been tested for their usefulness in
     any ontology, certainly not in anything as large as Cyc.    (05)

  2. Therefore, I recommend that we adopt the OpenCyc terms together
     with the axioms available in OpenCyc as a "starter set" for
     developing the Foundation Ontology.    (06)

  3. But we would also welcome and encourage anybody with different
     preferences to contribute more terms and theories to the FO.    (07)

  4. When convenient, those new theories should adopt the same
     spelling as the terms already in the starter set, but any
     new terms (taken from Longman's or any other source anyone
     might prefer) could also be used in those additional theories.    (08)

  5. The complete hierarchy of theories for the FO would be
     organized along the lines we've been discussing in these notes.    (09)

  6. Pat's goal of finding a much smaller set of defining terms
     (called primitives) could be pursued in parallel with the
     development of the FO instead of delaying its development.    (010)

  7. If and when a good set of primitives has been found useful,
     a methodology based on them could be promoted for simplifying
     further ontology development.  However, older theories that
     use the older terms would remain available as long as anyone
     needs them.    (011)

I'm suggesting this as a compromise that would begin with a large
tested set of terms already organized in a generalization/
specialization hierarchy.  It would also accommodate new terms
that could be added as needed from any source.  But if anyone
wants to do any immediate implementation today, the OpenCyc
terms can be used.  Any work done with them would be guaranteed
to be supported by the full FO.    (012)

MW> I think [the FO] is more than just a vocabulary, but I agree
 > that great care would need to be taken not to introduce into what
 > I am calling abstract theories axioms that contradict, say,
 > the 3D and 4D axioms that would be introduced when they were
 > combined with those theories.    (013)

I agree.  In fact, one advantage of OpenCyc is that it doesn't have
all the axioms (AKA old baggage) of full Cyc.  That implies that it
is more general than Cyc and less likely to have unpleasant
"surprises" (AKA inconsistencies).    (014)

MW> However, having such [abstract theories] would greatly simplify
 > mapping between say 3D and 4D ontologies. These would I think need
 > to be carefully designed rather than just picking something up,
 > since anything picked up casually has almost certainly made some
 > upper ontology assumptions.    (015)

I agree.  Many, if not most, of those theories could be based on
terms that are already in OpenCyc.  In fact, if you just adopt
the terms by themselves without any axioms, there is no possibility
of an inconsistency.  Then any axiom that is added could be tested
for consistency with both the 3D and the 4D ontologies (along the
lines that I suggested in my previous note).    (016)
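
As a minimal sketch of that testing step, the Python below treats each
axiom as a signed atomic claim and rejects a candidate axiom if it
directly contradicts either upper-level theory.  The axiom names and
the helpers is_consistent and compatible_with_all are hypothetical,
chosen only for illustration; a real test would need a theorem prover
rather than this check for direct contradictions.

```python
# Toy model: an axiom is a signed atomic claim, e.g. ("ObjectsHaveTemporalParts", True).
# This only detects direct contradictions; it is not a substitute for a prover.

def is_consistent(axioms):
    """Return True if no atom is asserted both true and false."""
    seen = {}
    for atom, value in axioms:
        # setdefault records the first polarity seen and returns it afterwards.
        if seen.setdefault(atom, value) != value:
            return False
    return True


def compatible_with_all(candidate, theories):
    """Check a candidate axiom against every upper-level theory."""
    return all(is_consistent(axioms | {candidate}) for axioms in theories.values())


# Hypothetical upper-level theories, for illustration only.
UPPER = {
    "3D": {("ObjectsHaveTemporalParts", False)},
    "4D": {("ObjectsHaveTemporalParts", True)},
}

neutral = ("PointInTimeIsTemporalThing", True)   # takes no side on 3D vs 4D
biased = ("ObjectsHaveTemporalParts", True)      # commits to the 4D view

print(compatible_with_all(neutral, UPPER))
print(compatible_with_all(biased, UPPER))
```

Under these assumptions, only an axiom that passes against both
theories would be admitted to the shared, neutral level.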

DF> An ontology with a single theory would complicate the idea of
 > a foundational ontology as an interlingua to which all external
 > ontologies can be mapped.    (017)

Yes.  That is why I would prefer to use the name Foundation Ontology
for the full hierarchy of all the theories in the OOR.    (018)

DF> Competing external theories could be incorporated by defining new
 > concepts/relations for the relata of the competing theories (which
 > map directly to the terms of the source ontology) and then adding
 > rules relating them with the ostensible "sameAs" terms already in
 > the foundational ontology.    (019)

PC> Yes, but I would express the process as "logically specifying
 > the terms of each extension ontology using only the terms in the
 > FO" rather than "map directly" since that suggests that the
 > entities in the extension ontology are already on the FO, which
 > in general they will not be.    (020)

I would expect the FO to be completely open-ended so that the
number of useful terms would increase indefinitely.  There could
also be a small subset of recommended defining terms.  (I wouldn't
even object to calling them "primitives".)    (021)

But we do have to recognize that many "primitives" may be very
underspecified.  For example, the term PointInTime should be neutral
with respect to a 3D or 4D ontology. For lower-level microtheories,
you might have a general theory called Hiking, which would use
PointInTime without any dependencies on 4D or 3D.  But if there are
dependencies, one could have common specializations of Hiking with
either of the two upper-level theories to generate the subtheories
Hiking4D and Hiking3D.    (022)
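
The Hiking example can be sketched in Python, assuming a theory is
nothing more than a named, immutable set of axioms and that a common
specialization is the union of its parents' axioms.  The class Theory,
the functions combine and is_specialization, and the axiom strings are
all hypothetical placeholders, not anything taken from Cyc or OpenCyc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theory:
    """A named, immutable set of axioms (axioms are plain strings here)."""
    name: str
    axioms: frozenset


def combine(name, *parents):
    """Common specialization: a new theory holding the union of the parents' axioms."""
    axioms = frozenset().union(*(p.axioms for p in parents))
    return Theory(name, axioms)


def is_specialization(child, parent):
    """A child specializes a parent if it contains all of the parent's axioms."""
    return parent.axioms <= child.axioms


# Hypothetical axioms, for illustration only.
hiking = Theory("Hiking", frozenset({"A hike starts at some PointInTime"}))
upper3d = Theory("Upper3D", frozenset({"Objects are wholly present at each PointInTime"}))
upper4d = Theory("Upper4D", frozenset({"Objects have temporal parts"}))

hiking3d = combine("Hiking3D", hiking, upper3d)
hiking4d = combine("Hiking4D", hiking, upper4d)

print(is_specialization(hiking3d, hiking), is_specialization(hiking4d, hiking))
```

In this sketch the general Hiking theory is never modified; Hiking3D
and Hiking4D are new theories added alongside it.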

JFS> But if you think of [the FO] as a collection of theories organized
 >> in a hierarchy, no single theory ever changes.  Instead, each
 >> innovation adds another theory to the hierarchy, which may be a
 >> generalization, a specialization, a sibling, or a cousin of some
 >> other theory.  You  can also compare and combine theories.    (023)

DF> Agreed.  This is part of the reasoning behind the Cyc method of
 > "microtheories" or contexts....    (024)

PC> And that would be true for the FO and its extensions also. But there
 > is an additional aspect to the FO as I have proposed it.  The CYC
 > BaseKB is part of every other more specialized microtheory, but it
 > was not designed as, nor used as, an inventory of basic elements that
 > is sufficient to specify (as combinations) the intended meanings of
 > the symbols that are in all of the other linked microtheories.  CYC
 > didn't try that tactic, which could have been very informative.  But
 > a proper test would in any case require that a good number of separate
 > groups with different applications and viewpoints try to use the same
 > FO to describe their different domain ontologies.    (025)

This comment mixes two different goals:    (026)

  1. A large useful ontology that people can begin using ASAP.    (027)

  2. A project to determine whether a small subset of terms (called
     primitives) is sufficient to define everything else and would
     thereby promote interoperability.    (028)

Different people may have different priorities.  My personal preference
is to emphasize #1.  I have no objection to anyone who prefers #2, but
I would not want to tell people who have a day job that they have to
wait until #2 is finished.  My recommendation is to start with #1 and
let anyone who wants #2 *extract* some subset from #1 in order to
test that hypothesis.  If that hypothesis seems to be justified, then
the results could be developed into a methodology for using, adapting,
and streamlining the much larger resources of #1.    (029)

But I would never suggest that people who need an ontology today should
wait until project #2 is completed.  The above proposal lets people
start new projects using OpenCyc and rest assured that the FO would
continue to support them in the future.    (030)

JFS> You can think of context as some additional statements S that are
 >> added to a theory T to specialize it for some particular application.    (031)

DF> If the context referred to is what Cyc calls a DataMicrotheory,
 > then the statements added are qualitatively different from those
 > in the basic theory.
 >
 > A context might close T's open world assumption, such that T2 has
 > a closed world assumption.  T and T2, in such a case, would be
 > different types of theory.    (032)

That's a good point.  It illustrates an important advantage of
starting with OpenCyc (or some subset of it).  We can take advantage
of the 26 years of experience in developing Cyc.  We don't have to
adopt every one of their decisions, but when we diverge, we should
have a good reason for doing so.    (033)

DF> An FO would need to have a reasonably restrictive generalization
 > of classes included in any microtheory that is to be mapped to it.
 > Defining "reasonably restrictive" could be hard, but it seems to me
 > that SUMO (with extensions) and Cyc both would currently meet this
 > requirement.  Including concepts from UMLS and GoodRelations would
 > lower the rough lower edge of the directed acyclic graph of classes
 > in several key areas.    (034)

I'm all in favor of building on good work that has been done in other
systems.  We should make sure that the licensing terms are compatible
and get explicit permission to use whatever is incorporated in the FO.    (035)

DF> One question is on what basis individuals should be included
 > in an FO.  Certainly units of measure should be.  Currencies &
 > countries surely.  Cities and every instance in the GeoNames base?
 > How should the selection of people to add be made?  Organizations?
 > Conceptual works (books, movies, songs, albums, paintings,
 > constitutions, poems, ...)?  Sports and games? Events (disasters,
 > wars, elections, mergers, ...)?  Etc.    (036)

Those are excellent questions.  I would prefer to err on the side of
being more inclusive.  My suggestion would be to keep the axioms
relatively free of individual names, but to have an associated
database that would store as much as anyone might find useful.    (037)

However, there will undoubtedly be many very useful specialized
theories, such as US IRS Tax Code for 2010.  The most qualified
people to develop such a theory would be the IRS.  But the FO
could maintain pointers to such theories stored and maintained
in compatible formats by other organizations.    (038)

DF> The breadth of coverage of the proposed FO needs to be clarified.
 > Is it to be an ever-expanding set of all terms defined in any
 > ontology?   Should it include all individuals ever defined on
 > the Semantic Web?    (039)

Those are important policy decisions.  Since the SW is expanding
very rapidly, we can't hope to incorporate it into the FO.  But we
must have interfaces to it that would allow any application to
access it as needed.  In fact, if we do a good job on the FO,
the SemWebbers might take notice and adapt their technology to
facilitate sharing in both directions.    (040)

DF> Or could there be a basic, relatively fixed, FO to which an
 > expanding number of contextually restricted, but still centralized,
 > ontologies are related?  I could see such for brand-name products,
 > GeoNames, UMLS, IMDB, GeneBase, etc.    (041)

PC> The FO itself should try to include all primitives that are used
 > by more than a small set of specific domain ontologies, and only
 > those non-primitive elements that are needed for ease of use and are
 > non-controversial.  Primitives required by domain ontologies should
 > also be maintained, but as part of a domain extension.    (042)

The goal of getting a useful FO ASAP implies that we should start
with a much larger set of terms than Pat has in mind.  But that may
be an advantage.  The goal of extracting a smaller number of defining
terms can be guided by usage patterns.  Those terms that are most
widely used would be prime candidates.  The other terms could be
redefined in terms of them.    (043)
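
One way to read "guided by usage patterns" is to count how many
theories use each term and treat the most widely used terms as
candidate primitives.  The Python sketch below does only that; the
theory names, the term lists, and the min_theories threshold are
hypothetical, and a real selection would also have to check that the
remaining terms can actually be defined from the candidates.

```python
from collections import Counter


def rank_terms_by_usage(theories):
    """Count in how many theories each term appears, most widely used first."""
    counts = Counter()
    for terms in theories.values():
        counts.update(set(terms))  # count each term once per theory
    return counts.most_common()


def candidate_primitives(theories, min_theories):
    """Terms used in at least min_theories theories, in alphabetical order."""
    return sorted(t for t, n in rank_terms_by_usage(theories) if n >= min_theories)


# Hypothetical theories and the terms they use, for illustration only.
THEORIES = {
    "Hiking": ["PointInTime", "Trail", "Person"],
    "Banking": ["PointInTime", "Account", "Person"],
    "Geography": ["Trail", "Region", "PointInTime"],
}

print(candidate_primitives(THEORIES, 2))
```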

I would expect an FO committee to be similar (in some ways) to the
W3C.  It would define formats, guidelines, policies, etc., and
encourage other groups to adopt them and make their work compatible.    (044)

John    (045)


