Re: [ontolog-forum] Foundation ontology, CYC, and Mapping

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sun, 31 Jan 2010 10:44:57 -0500
Message-id: <4B65A579.4000602@xxxxxxxxxxx>

Pat and Ronald,    (01)

The multiple word senses and microsenses are not the result of using
a natural language.  They result from the changeability of the world
itself.  If we lived in a static culture with no contact with other
cultures and no innovations of any kind, natural languages would
tend to have just one sense per word.    (02)

In fact, Lithuanian is the modern Indo-European language that has
changed the least since ancient times. The reason for its stability
is that Lithuania is close to the ancient Indo-European homeland,
it is far away from the major trade routes from China to the
Mediterranean, it was surrounded by people who had similar culture,
and there were very few immigrants from other areas who ever learned
or tried to learn Lithuanian.  Even in modern times, people who moved
to Lithuania continued to speak their native languages (e.g., Russian
or Yiddish) for many generations.    (03)

PC> I don't disagree with the way John characterizes the use of
 > human languages, but the issue that a common foundation ontology
 > addresses is to have a reliably stable set of *logical* terms
 > whose meaning is fixed.    (04)

That goal of fixed, unambiguous word senses is possible in a
frozen programming language with a single, unchangeable compiler
or interpreter.    (05)

IBM had a good term for such systems:  'functionally stabilized'.
That term was only applied to systems that had been discontinued
and for which IBM was doing "end of life" maintenance for the
few remaining installations.  It was a euphemism for "dead end".    (06)

RS> Making and maintaining the meanings of signs used by people
 > will always depend upon human judgment and will always resist
 > mechanical standardisation.  Hence we should live with that fact
 > and put more effort into improving the intelligence of human
 > communities...    (07)

I strongly agree.  But I would qualify the following point:    (08)

RS> ... and less into creating a generalised mechanical form
 > of artificial intelligence while still building limited scale,
 > pseudo-intelligent machines.    (09)

The major qualification is that people have to communicate with
and by means of their machines.  That means that we need to design
tools and languages that will facilitate communication.  That is
the primary focus of the following talk:    (010)

    Controlled natural languages for semantic systems    (011)

It's also important to design tools that can extract information
from documents that were written by humans to communicate with
other humans.  That's the theme of the following talk:    (012)

    Two paradigms are better than one,
    and multiple paradigms are even better    (013)

The point of this talk is that it's possible to design systems with
multiple heterogeneous agents that use different algorithms and
different ways of reasoning, computing, and language processing.
And, I would argue, that is the best way to design flexible systems
that can bridge the gap between human methods of reasoning and the
more tightly controlled computational systems.    (014)

RS> Nevertheless, there does exist a very stable structure across
 > cultures and time arising from 'ontological dependency'.  We
 > arrived at this relationship on the basis of an ontology (= an
 > understanding about the nature of reality) that takes into account
 > how people construct the world they perceive.  One thing depends
 > ontologically upon others when it can exist only during their
 > coexistence.  (For a little more see parts of the two papers on
 > http://www.rstamper.co.uk .)    (015)

There's a lot of good material in those papers and the remainder
of your note, but it's too much to digest in just one email note.
I would love to see Pat and Ronald collaborate to determine how
Ronald's "ontological dependencies" could be reconciled with Pat's
"foundational ontology" in some realistic goals for future systems.    (016)

PC> If programmers want their programs to interoperate via the FO,
 > they would have to change the mapping for any new term or term of
 > changed meaning if it is used in communicating with other systems
 > via the FO.    (017)

Programmers have been designing and implementing interoperable
systems for over 40 years, and I seriously doubt that they would
change their methodologies just because somebody gave them an FO.
Ronald has worked on and thought about such issues for a long time,
and I'd like to see how his approach could be reconciled with Pat's
notion of an FO.    (018)

PC> I reiterate:  the point of the FO project is (1) to *agree on*
 > (not just to create) a common FO; (2) to create multiple
 > independently developed useful demo programs that use an ontology
 > mapped to the FO; and (3) to demonstrate communication among
 > those programs by means of the FO.    (019)

I'm delighted that you emphasized the phrase *agree on* because
that is the major stumbling block.  Ten years have elapsed since
the SUO email list was launched with the goal of producing an
IEEE standard for ontologies.  Fifteen years have elapsed since
the ANSI working group on ontology was started with many of
the usual suspects who still contribute and lurk on this email
list.  And almost twenty years have elapsed since the start
of the Shared Reusable Knowledge Base (SRKB) project, which had
very similar goals.    (020)

Those projects promoted, stimulated, or influenced some useful
components:  KIF, KQML, Interlingua, UMLS, Common Logic, SUMO,
DOLCE, and even my _Knowledge Representation_ book.  But the
sobering realization is the absence of *agreement* on anything
that would remotely resemble a Foundational Ontology.    (021)

Also during those twenty years, Cyc grew from a small prototype
to the largest formal ontology in the world.  To a large extent,
Cyc influenced and was influenced by those developments.  Cyc
has produced something that resembles Pat's FO more closely
than anything else in existence.    (022)

Some large corporations and government agencies had full access
to all of Cyc, but *nobody* in those institutions was able to
do anything useful with it.  In fact, I spoke with the manager
of one of the research groups that had access to Cyc, and I asked
him what they had been doing with it.  He replied:    (023)

    Several people had been involved in using Cyc for their research
    projects over the past five years.  Every one of them has been
    fired, and I don't believe that is a coincidence.    (024)

I also visited Microsoft Research, where two of my former IBM
colleagues had built the first grammar checker for MS Word (to
produce the notorious green squiggles).  I asked them what MSFT
was doing with Cyc, and they said that nobody had found anything
useful to do with it.  For processing natural language, they were
getting much more useful information by analyzing the traditional
dictionaries written for human consumption.  See the MindNet project:    (025)

    http://research.microsoft.com/en-us/projects/mindnet/    (026)

PC> I have pointed to the use of the Longman DV as *presumptive
 > evidence* (I never suggested that it was proof) that a comparable
 > small inventory of ontological concepts would serve analogously
 > to specify the meanings of many other concepts.    (027)

WordNet, MindNet, and other projects that have actually been used
in NLP systems do not assume such a base.  At VivoMind, we have also
been developing tools to extract a proto-ontology from a collection
of documents.  As a starter set, we have found large collections,
such as WordNet and Roget's Thesaurus, to be far more useful than
small collections such as Longman's.  See the following article:    (028)

    http://www.jfsowa.com/pubs/paradigm.pdf    (029)

The logical issues with WordNet have not been a serious problem
for us, because we refine the derived proto-ontology by automated
and semi-automated means.  Any inconsistencies in the starting
sets tend to get cleaned up by the refining process.  (That was
also the experience of the MindNet project, although they used
different methods of derivation and refinement.)    (030)

Conclusions:    (031)

  1. Formally defined, deeply axiomatized ontologies are valuable
     for small, highly specialized microtheories.    (032)

  2. For large, broad coverage of any significant amount of language,
     a great deal of useful information can be derived *automatically*
     from dictionaries, encyclopedias (including Wikipedia), and other
     documents written for human consumption.    (033)

  3. Attempts to define large ontologies such as Cyc by human experts
     have largely been a waste of time and money.  If they are already
     available, they can be useful.  But methods #1 and #2 have been
     far more successful.    (034)

John    (035)
