Re: [ontolog-forum] Next steps in using ontologies as standards

To: <edbark@xxxxxxxx>, "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Wed, 7 Jan 2009 20:23:24 -0500
Message-id: <038401c9712f$b567cc50$203764f0$@com>
Ed,
   Lots of comments and questions, thanks for the detail.  I will only
respond to a few:
(1) > The question, Pat, is: How is what you are proposing different from
what the Cyc and SUMO and IFF folks did and _are doing_?  What new insight
do you bring?
   The new insight is that focusing on the semantic primitives makes every
part of the task easier.  But that is not the only difference.
   (1a).  The development process I am suggesting is different -
representatives of at least 50 different groups will create a FO that
contains enough of the semantic primitives so that information represented
in one system can be accurately encoded in any other system used by the
participants.  Thus we do not have only one perspective, as in Cyc or SUMO,
but a variety of perspectives, with redundant representations that can be
translated into each other, or into the representations of information used
in relational databases.  This process will create a community of diverse
users (the developers) with a stake in seeing the work extended and widely
adopted.  The other projects were essentially one-group efforts (though some
external input for SUMO was solicited and received, the result was still
almost totally a product of a few people, with no alternative
representations).
   Starting with a larger array of users, with an ontology known *not* to be
idiosyncratically structured by a single group, is a very big difference.
  (1b)   The FO is not intended to be used as a whole for any one project,
but to have only those parts needed for specific applications taken out and
used.   There will be no loss in computational efficiency due to the
redundancy in the FO, since only a non-redundant subset need be used in any
given application.  The alternative logically consistent representations
enable different groups to use their own preferred representations without
losing the ability to translate into the other representations.  Neither Cyc
nor SUMO has this capability.
   The NIEM (National Information Exchange Model) has a utility to extract
parts to create highly specific "Information Exchange Packets".  Some
utility will also be needed to extract parts of the FO, and parts can be
tagged as being components of specific subsets.
  (1c)  Focusing on representing mostly the semantic primitives (other
non-controversial representations might be included, to make use easier)
reduces the need for agreement to the minimum required to perform the
task of translating among alternative representations.
  (1d) Cyc is not now and never has been completely open source, and is not
maintained by an open group of contributors.  There is little incentive for
volunteers to add to the Cyc system for the benefit of a private company.
SUMO has no public ongoing development effort.  Some other open-source
projects have been quite successful in gathering valuable effort from a wide
community.  This one may or may not be.  When the FO is viewed as a resource
to enable communication among many different communities, rather than as
just a starting point to develop local applications, the open-source method
is particularly apt and valuable.  The focus on enabling translation among
different representations is a very big difference from the philosophy that
drove the creation of Cyc or SUMO.
   (1e)  Neither OpenCyc nor SUMO has representations of all of the terms
in the Longman defining vocabulary.  In the COSMO I am adding such
representations, to get a simple version of a primitives-based FO that can
be used for experiments to test the hypothesis that there is in fact only a
limited number (under 10,000) of semantic primitives required for the
ontology translation task.  As with the Longman vocabulary, it is likely to
turn out that in the first version of an FO there will be ontology elements
not primitive and not needed for the translation task.  These can be
retained or moved to an extension, at the discretion of the committee
maintaining the FO.
   (1f) The last phase of FO development should include a strong focus on
developing a natural-language interface to the ontology, to assist users in
finding or creating elements they require for applications.  This interface
should also be available for testing and critique on the web, from its
earliest functional versions.  By allowing users to communicate with the FO
system using the controlled vocabulary of the Longman (possibly extended
with terms defined by use of the Longman terms), the complexity of the NLU
task will be dramatically reduced.  This part of the project is the most
speculative, because it is not known just how much easier NLU will be when
the vocabulary is limited almost exclusively to the semantic primitives.
But it has a better chance of succeeding, I believe, than NL tasks that
require processing of unrestricted text.  Cyc's NL interface is not
available for simple testing via a Web interface, but I have been informed
that it does not use a vocabulary as restricted as the Longman defining
vocabulary.    (01)
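
As an illustration of the extraction utility mentioned in (1b), here is a
minimal sketch in Python.  It assumes (hypothetically) that each FO element
records the elements its definition depends on and the subsets it has been
tagged for.  The Element class, the extract_subset function, and the toy
element names are invented for this example and are not part of COSMO, NIEM,
or any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One ontology element (hypothetical structure, for illustration only)."""
    name: str
    depends_on: set[str] = field(default_factory=set)  # elements used in its definition
    tags: set[str] = field(default_factory=set)        # subsets it is tagged as part of


def extract_subset(ontology: dict[str, Element], tag: str) -> set[str]:
    """Return the names tagged with `tag` plus everything they depend on."""
    selected: set[str] = set()
    pending = [name for name, el in ontology.items() if tag in el.tags]
    while pending:
        name = pending.pop()
        if name in selected:
            continue
        selected.add(name)
        # Follow dependencies so the extracted subset is self-contained.
        pending.extend(ontology[name].depends_on - selected)
    return selected


# Toy FO: the element names and tags are made up for the example.
fo = {
    "Object": Element("Object"),
    "Group": Element("Group", {"Object"}),
    "Event": Element("Event"),
    "Container": Element("Container", {"Object"}, {"logistics"}),
    "Shipment": Element("Shipment", {"Group", "Container"}, {"logistics"}),
    "Diagnosis": Element("Diagnosis", {"Event"}, {"medical"}),
}

logistics = extract_subset(fo, "logistics")
print(sorted(logistics))  # ['Container', 'Group', 'Object', 'Shipment']
```

Only the tagged elements and whatever their definitions depend on are pulled
out, so in this sketch an application carries none of the unrelated or
redundant parts of the FO.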

(2) [EB] > 
> It should be possible to translate the published Cyc and SUMO
> ontologies into CLIF or OWL/Full, and perhaps that would be a
> good start.  But then
> we come to what real reasoning capabilities and behaviors are required
> and whether any available tools have them.  So the translation might
> just produce a different dead-end.  But IMHO, translating a work that
> has the experience of some usage and considerable evolution is far
> better than starting in a green field.
>
   All of the parts of OpenCyc and SUMO that can be used freely (without the
GPL virus) can be included in the common FO, along with anything anyone else
wants that (a) is semantically primitive, or is included in an extension;
and (b) makes technical sense and is logically consistent with other
components.  As you mentioned before, the FOL reasoner that will accompany
the FO will have to be adapted to whatever syntax is chosen.    (02)

(3) [PC] > > (4) Developing an application - even database integration via
>> ontology - is much more costly than developing a foundation ontology that
>> can handle a local problem.
> 
[EB] > First, I have seen no evidence for this.  Second, I simply don't
> believe it.    (03)

   Well, this assumes that OpenCyc and SUMO+MILO are close enough
approximations to an FO to serve as lessons learned, and I haven't seen any
applications that use them to impressive effect.  Perhaps there are some
hidden under proprietary covers.  From this I conclude that building
applications must be a lot more costly than building the ontology.  There is
a second reason: I think that the first "killer app" will be a good NLU
system, perhaps deriving from the core based on the semantic primitives that
might be developed as a part of the FO project.  In trying to imagine how
such an ontology-driven NLU system can be developed, the most plausible
method I can think of is very labor-intensive, and more expensive than the
ontology project I am suggesting.  Existing NL systems that use an ontology
do not appear to have a capability anywhere close to human, from which I
also conclude that a lot of effort must be required.  But I would be happy
to be proven wrong.  Can you point me to some impressive application using
an ontology (real reasoning to solve a practical problem, not just data in,
data out) that did not take a lot more effort than building the ontology?
Tell me what is genuinely impressive to you; I won't carp.    (04)


(4) [EB] > 
> This is more on the same topic.  I agree.  But my point is that the FO
> problem is not just getting reasoners to support consistent semantics
> for CLIF; it is choosing the axiom set that will give results (at all)
> in a particular problem space with a particular conformant reasoner.
> Every AI student is conversant with a slew of tuning tricks.  Some are
> external to the ontology (steering variables and the like); but some are
> internal (helper axioms, concept splitting, and reworking to avoid
> certain constructs and interactions).  The internal tuning tricks that
> modify the ontology produce a "technically different" ontology, even
> though the conceptual intent is equivalent.  And if you "split" and
> introduce an intermediate concept, the ontologies are clearly no longer
> formally equivalent -- one of them has an extra relation.
>
  Interesting historical detail, beyond my own experience.  It sounds like
such issues need to be discussed at the earliest point of planning.  Perhaps
the reasoner will have to be part of the system, to minimize those kinds of
second-order effects.    (05)

(5) [EB] > I'm a bit out of my element here, but it is my impression that
> requiring definition of every concept using only Basic English and other
> terms previously so defined will at some point encounter disambiguation
> problems.  In primitive natural languages, words extend over time to
> have multiple related meanings, and particular combinations are
> canonized as the designations for particular specialized concepts, _in
> addition to_ their interpretation as circumlocutions.  And that is one
> of the traps in language parsing, as you and Chris demonstrated with
> "expressive power".
>
    I'm not sure I understand this point.  The problem of variation of word
meaning over time does not directly impact the structure of the FO, unless
the word meaning variation is caused by new primitive concepts entering the
language.  The mapping of linguistic words to ontology elements is a task
for those creating NL programs that use the ontology.  Perhaps you are
concerned with having some grounding for meaning so that it "bottoms out" on
something other than more symbols.  In principle, we can include procedural
attachments that allow the system, with some sensor and robotic
capabilities, to verify the reality of some assertions - something like
Woods' "procedural semantics".  But for the first version I think it is only
necessary to adopt the stance of Nirenburg - the "meaning" of an ontology
element is its representation in the ontology, and its outputs verify for us
that it has an adequate level of "understanding".  As long as the ontology
does what we want it to do, we don't have to demand a deeper level of
understanding comparable to the human, based on sensorimotor experience.
But it is an intensely interesting question.    (06)
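
One way the hypothesis in (1e) could be checked mechanically, and the
mutual-definition concern raised in the quoted message made visible, is
sketched below in Python.  It assumes that each definition has been reduced
to the set of terms it uses.  The function name and the tiny vocabulary are
invented for illustration and are not taken from the Longman list or from
COSMO.

```python
def undefinable_terms(definitions: dict[str, set[str]], primitives: set[str]) -> set[str]:
    """Return the terms that cannot be reduced to `primitives` alone.

    `definitions` maps each non-primitive term to the terms used to define it.
    A term counts as grounded once every term in its definition is grounded.
    """
    grounded = set(primitives)
    changed = True
    while changed:
        changed = False
        for term, used in definitions.items():
            if term not in grounded and used <= grounded:
                grounded.add(term)
                changed = True
    return set(definitions) - grounded


# Toy vocabulary: every term here is made up for the example.
primitives = {"thing", "part", "move", "place"}
definitions = {
    "container": {"thing", "part", "place"},
    "shipment": {"container", "move"},
    "consignor": {"consignee", "shipment"},  # defined only via each other
    "consignee": {"consignor", "shipment"},
}

stuck = undefinable_terms(definitions, primitives)
print(sorted(stuck))  # ['consignee', 'consignor']
```

In this toy case the two terms left over are defined only in terms of one
another.  Under the assumptions of the sketch, such a pair would either call
for an additional primitive or have to be treated as mutually axiomatized
domain terms.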


(6) > > [[PC-5]] I don't think that $30 million over 3 years qualifies as
>> "massive" in comparison with the costs of lack of semantic
>> interoperability (100 billion per year), or to the costs (government and
>> private) of prior and ongoing efforts to address the semantic
>> interoperability problem by less effective methods.
> 
[EB] > Yeah.  My management trots out these statistics for every budget
> cycle, too.  You can save that BS for the funding sources.
>
   Do you think that the inefficiency costs of lack of semantic
interoperability are trivial? Below $50 million per year?    (07)

Pat    (08)


Patrick Cassidy
MICRA, Inc.
908-561-3416
cell: 908-565-4053
cassidy@xxxxxxxxx    (09)


> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-
> bounces@xxxxxxxxxxxxxxxx] On Behalf Of Ed Barkmeyer
> Sent: Wednesday, January 07, 2009 4:29 PM
> To: [ontolog-forum]
> Subject: Re: [ontolog-forum] Next steps in using ontologies as
> standards
> 
> 
> I wrote:
> 
> >> With my standards experience hat on, I would say that 'nothing
> >> remotely resembling a "market"' after 6 years translates to academic
> >> shelfware.
> >> If no one is using it, one of the following must be true:
> >>   - it does not effectively support any practice that is (currently)
> >> perceived to need support; or
> >>   - it is not being used by the people and tools engaged in the
> >> practice it was intended to support.
> 
> Pat Cassidy wrote:
> 
> > [[PC-1]] there are other possibilities that I think are actually the
> > problem.
> > (1) people are exploring the use of these ontologies, and actually using
> > them internally, but nothing dramatic has been thus far released for
> > public inspection, and probably nothing of general interest has been
> > developed in spite of ongoing efforts. See point (2)
> 
> Well, then the work has a market.  It has customers.  There is no
> requirement for customers to publish.  And most ontologies, like most
> database schemas, are private.
> 
> > (2) because a foundation ontology is as complex as the basic vocabulary
> > of a human language, it is time-consuming to learn how to use to best
> > effect, and without an existing dramatic application, efforts to use it
> > are hesitant and starved for funds.
> 
> But this is true of any technology with a high cost of entry.
> Commercial organizations will need to see a pretty strong business case.
> If we train young engineers to use the technology, and it is supported
> by reliable tooling, that lowers the entry cost substantially.  And
> lower cost of entry translates to more rapid adoption.
> 
> > (3) the lack of broad agreement on which of the foundation ontology
> > candidates will become the widely used standard inhibits commitment of
> > significant funding to any one of them.   No one wants to risk going
> > down a dead-end.
> 
> This presumes that there are lots of these candidates in existence.
> I know of two, and their relationship to standard languages and tooling
> support very definitely inhibits the wide use of either one.  In their
> current forms, they are already dead-ends.
> 
> It should be possible to translate the published Cyc and SUMO ontologies
> into CLIF or OWL/Full, and perhaps that would be a good start.  But then
> we come to what real reasoning capabilities and behaviors are required
> and whether any available tools have them.  So the translation might
> just produce a different dead-end.  But IMHO, translating a work that
> has the experience of some usage and considerable evolution is far
> better than starting in a green field.
> 
> > (4) Developing an application - even database integration via ontology
> > - is much more costly than developing a foundation ontology that can
> > handle a local problem.
> 
> First, I have seen no evidence for this.  Second, I simply don't believe
> it.  The cost of developing a would-be "foundation ontology" by
> committee has proved to be quite high by comparison with the use of
> similar resources to build a well-defined application, even database
> integration via ontology.  (Been there, done that, have the t-shirt, and
> the scars.)
> 
> > In the government agencies where I worked the first question asked is
> > "is this technique being used in a program of record?" to which the
> > answer was, when I heard it asked a couple of years ago, "no, not to my
> > knowledge". The response then is - "we aren't doing research here, and
> > we aren't going to be the first to break in a new technology".
> 
> Yeah. That's called "mission awareness", and it isn't restricted to
> government agencies.  And it is why we have grant programs and consortia
> that support pilot projects.  To get the funding, you have to stop
> talking about abstract "foundation ontologies" and talk up a pilot
> application.
> 
> > [[PC-2]] The choice of some reasoner to use with an FO may well be very
> > important, as you suggest.  The FO should be in a Common Logic
> > conformant language, and should be susceptible to interpretation by a
> > FOL reasoner like Vampire or Prover9, or the Ontology Works system.  But
> > there are variations in the way certain structures are handled (such as
> > forall-exists axioms) and these variations may have serious effect on
> > the results.  I would expect that a project to create a common FO would
> > include a component to choose and tune a reasoner to work well with the
> > FO.  At some point third-party vendors may develop better reasoners that
> > use the same FO, and that would be all to the better, but I agree that
> > it will be important to make sure that some reasoner works well with
> > the FO.
> 
> I think we are in total agreement on this.  But based on recent NIST
> experience with Prover9 (erstwhile Otter), I worry that the particular
> choices for axiomatic formalization of a concept may determine whether a
> given reasoner will or will not produce desired results.  "Tuning" might
> require a replacement axiom set for one or more foundational concepts.
> And that, in turn, relates to which other supporting concepts are
> introduced, and so on.  So the question then becomes:  What percentage
> of the original ontology _formulation_ is actually reusable?  What is it
> we are actually going to standardize?
> 
> > [[PC-3]] Well, in the absence of positronics I think that some variant
> > of an FOL reasoner will have to be used, but that variant, I agree,
> > needs to be thoroughly tested, and perhaps modified, to accommodate the
> > intended interpretations of syntactical structures in the FO.  As an
> > example, ...
> 
> This is more on the same topic.  I agree.  But my point is that the FO
> problem is not just getting reasoners to support consistent semantics
> for CLIF; it is choosing the axiom set that will give results (at all)
> in a particular problem space with a particular conformant reasoner.
> Every AI student is conversant with a slew of tuning tricks.  Some are
> external to the ontology (steering variables and the like); but some are
> internal (helper axioms, concept splitting, and reworking to avoid
> certain constructs and interactions).  The internal tuning tricks that
> modify the ontology produce a "technically different" ontology, even
> though the conceptual intent is equivalent.  And if you "split" and
> introduce an intermediate concept, the ontologies are clearly no longer
> formally equivalent -- one of them has an extra relation.
> 
> > [[PC-4]] It is not "ignorance" to observe past experience and recognize
> > where false analogies can be misleading.
> 
> Agree.  I have often quoted Mark Twain: "The trick is to glean from an
> experience exactly the knowledge that is contained in it.  A cat which
> sits down on a hot stove will never do it again, but it will never sit
> on a cold stove again either."
> 
> The problem is to be wise enough to do that.  I don't claim to be, and
> this industry certainly isn't.
> 
> > The best past experience for basing a foundation ontology comes from
> > use of limited defining vocabularies in some dictionaries.  Because the
> > words used in the dictionary definitions are labels for concepts, it is
> > reasonable to infer that there is also a limited set of defining
> > concepts (and ontological representations of those concepts) that would
> > allow ontological description of an unlimited number of terms,
> > concepts, or real-world entities in many domains.
> 
> I'm a bit out of my element here, but it is my impression that requiring
> definition of every concept using only Basic English and other terms
> previously so defined will at some point encounter disambiguation
> problems.  In primitive natural languages, words extend over time to
> have multiple related meanings, and particular combinations are
> canonized as the designations for particular specialized concepts, _in
> addition to_ their interpretation as circumlocutions.  And that is one
> of the traps in language parsing, as you and Chris demonstrated with
> "expressive power".
> 
> So, yes, you can get an infinite number of constructs from a finite
> grammar and a finite set of terms, but that doesn't mean you can get
> arbitrarily subtle semantics from any such set of constructs.  A
> "circle" is not "a polygon with a very large number of very short
> sides".
> 
> What I think this means is that every domain will introduce new
> relations that are axiomatically defined in terms of each other as well
> as in terms of the FO relations.  From a dictionary point of view, these
> new concepts may have "circular definitions".  From an axiomatic point
> of view, they are just mutually defined.  (I think Chris made this point
> much better.)  The RoI question is: At what point does this set of
> introduced domain terms make the FO essentially irrelevant?
> 
> Let's talk about an actual FO and a set of target problem spaces.  (La
> prova e nel gusto.)
> 
> In a recent ontology for a very small problem space, there were about
> 300 relations.  It could have used 20+ Cyc concepts and their axioms.
> How valuable is the FO to that domain ontology?  Well, it is nice to
> have the concepts and the "part/whole" axioms for physical and logical
> collections, except that you have to sift through two dozen such
> concepts to find the axiom set you mean in each case.  And you still
> have to define the specializations, in order to constrain at least one
> of the roles.  If I were more conversant with the Cyc ontology, I
> probably would have known which collector I wanted in each case, but
> that is after my domain analysis tells me what axioms I need.  Now, once
> I had determined what I meant, I found a Cyc concept had all the needed
> axioms, including ones I would have forgotten.  OTOH, I had to reject
> two similar concepts because they included axioms I didn't mean.
> Without Cyc, I defined (most of) the same general concepts, but it took
> me several tries to get all the axioms right.
> 
> And saying that a Shipment is a Cyc:group doesn't really provide much in
> the way of common semantic basis for comparison with other ontologies.
> It took a while to get the domain experts to agree on the rules for a
> Shipment.  So we can be reasonably sure that other domain teams will
> produce slightly different axiom sets for closely related concepts.  And
> in fact, the subsequent project in the same community modified the
> model.  That is why I wonder about the utility of the FO.  When it takes
> a group of 10 automotive materials managers a day or two to agree on one
> model of Shipment, ShipmentUnit and Container, how many different
> Shipment ontologies are we going to see?  And who would consider any of
> them "fundamental"?
> 
> > But the experience of past standards efforts is relevant for the
> > foundation ontology only at a stage after there is agreement on the
> > inventory of primitive concepts and ontological representations thereof
> > that will suffice to form the "conceptual defining vocabulary".
> 
> Yes.  If that set of materials managers were producing an industry
> standard, we could definitely use that ontology as part of that
> standard.  But the industry will require our group to call our concept
> eKanbanShipment and require that we put another 150 people in a room to
> agree on the "fundamental" model of Shipment (in 5 years).
> 
> IMO, there is value to putting the ontology in the standard, but it is
> not so clear how "fundamental" any such ontology will ever be.  And my
> standards experience is that any concept that is _required_ to be widely
> used in other standards will have as few axioms as possible.  It will be
> as general and as semantically empty as they can make it, in some cases
> beyond the point of all utility, so that no group's use of it will have
> an axiom they don't want.  Look at the definition of "product" in the
> STEP Product Data Exchange standards for a truly useless classifier.
> 
> > [[PC-5]] I don't think that $30 million over 3 years qualifies as
> > "massive" in comparison with the costs of lack of semantic
> > interoperability (100 billion per year), or to the costs (government
> > and private) of prior and ongoing efforts to address the semantic
> > interoperability problem by less effective methods.
> 
> Yeah.  My management trots out these statistics for every budget cycle,
> too.  You can save that BS for the funding sources.
> 
> The question, Pat, is: How is what you are proposing different from what
> the Cyc and SUMO and IFF folks did and _are doing_?  What new insight do
> you bring?
> 
> Instead of trying to build FOs, could we just put ontologies in the data
> exchange standards, beside the XML schemas and UML models?  It is a lot
> easier to do; it is a lot clearer what its immediate value might be --
> clarifying, and forcing clarification of, fuzzy textual descriptions,
> reasoning about the exchanged data, direct support for semantic
> mediation, wider experience in ontology development, education of the
> XML-literate, etc.  And if any of these ontologies happens to become an
> FO by wide adoption, wow, a diamond in the coal mine!
> 
> -Ed
> 
> --
> Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
> National Institute of Standards & Technology
> Manufacturing Systems Integration Division
> 100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
> Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694
> 
> "The opinions expressed above do not reflect consensus of NIST,
>   and have not been reviewed by any Government authority."
> 
> 
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
>     (010)


