
Re: [ontolog-forum] Fundamental questions about ontology use and reuse

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Wed, 24 Jun 2009 19:29:16 -0400
Message-id: <004b01c9f523$9721b940$c5652bc0$@com>
John,
   It is admirable for any commercial enterprise to use any advanced
technique, with or without an ontology, to solve practical problems.  But
the problems you describe that were attacked with the VivoMind system are
not the ones that a common Foundation Ontology is intended to address --
they are local problems that do not require a common ontology, because
they do not require semantic interoperability among multiple independently
developed programs.  The two problems for which a common FO is especially
suited are: (1) broad, general, and accurate semantic interoperability,
supporting the integration of multiple *independently developed
applications* by providing a means to represent information in a form
accurately interpretable by all applications, regardless of the origin of
the information; and (2) providing a means to communicate results among
multiple independently developed modules, so as to allow greater reuse of
research results in AI applications, thereby increasing the efficiency of
those research projects that adopt the common standard of meaning.
   It is unfortunate that you keep citing projects whose results cannot be
independently evaluated.  Anecdotes are no substitute for reporting results
in a form that allows others to reproduce those results quickly and to
test the functionality of the systems on tasks other than those reported.
All scientific advances rest on the ability to reproduce reported results.

  On specific points:
[JS] > 
> There are four kinds of global search, each of which has very different
> requirements:
> 
>   1. A very precise search that supports detailed reasoning.  Some
>      systems do that by aligning multiple databases in a federation
>      that enables SQL queries to be executed across all the DBs in
>      the federation.    
  Yes, and I thought it was pretty clear from my note that this is the kind
of global search I was referring to, not any of the other three.

[JS] > 
> An ontology won't reduce the cost in the slightest -- *except for*
> those databases that had previously been designed or federated
> according to that ontology.  If you want to federate the databases
> before doing the inferences, you would still need the services of
> a company such as OntologyWorks, or you would have to do the
> equivalent work by yourself (and/or by your employees).
>
   The federation of legacy databases would be greatly accelerated, and
most of the work could be done by local data managers, using the kind of
natural language interface that I think should be developed as part of the
FO project.  But even more important is the point that you concede: RDBs
designed with the FO as their most fundamental data model would be
interoperable with each other without further mapping, at a development
cost no greater than that of an RDB built on an idiosyncratic local model.
This is scarcely a minor point.  As I mentioned in my note, accommodating
legacy systems is a good idea, both for its intrinsic value and for its
likelihood of accelerating acceptance.  But designing a new system solely
for legacy systems is unnecessary.  Ten or twenty years from now, most of
the RDBs then in use will have been developed after today, and all of them
could be made interoperable from the start, as soon as a common FO is
recognized as technically adequate and becomes widely enough used to
encourage adoption.  When moving forward, it is a good idea to look ahead,
and not to walk facing backward.
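
   To make the kind of interoperability I have in mind concrete, here is
a minimal sketch in Python: two independently designed schemas, each
mapped once to a pair of invented FO term identifiers, answering a single
query with no pairwise mapping between the schemas.  The term names,
tables, and data are illustrative only, not drawn from any actual FO:

    import sqlite3

    # Two independently developed schemas for the same kind of facts.
    db_a = sqlite3.connect(":memory:")
    db_a.execute("CREATE TABLE staff (fname TEXT, org TEXT)")
    db_a.execute("INSERT INTO staff VALUES ('Alice', 'MICRA')")

    db_b = sqlite3.connect(":memory:")
    db_b.execute("CREATE TABLE employees (full_name TEXT, employer TEXT)")
    db_b.execute("INSERT INTO employees VALUES ('Bob', 'VivoMind')")

    # Each local data manager maps local columns to shared FO terms.
    # The terms 'fo:PersonName' and 'fo:Employer' are invented here.
    mappings = [
        (db_a, "staff",
         {"fo:PersonName": "fname", "fo:Employer": "org"}),
        (db_b, "employees",
         {"fo:PersonName": "full_name", "fo:Employer": "employer"}),
    ]

    def federated_select(fo_terms):
        """Answer one query, phrased in FO terms, over all mapped DBs."""
        rows = []
        for db, table, m in mappings:
            cols = ", ".join(m[t] for t in fo_terms)
            rows += db.execute("SELECT %s FROM %s"
                               % (cols, table)).fetchall()
        return rows

    print(federated_select(["fo:PersonName", "fo:Employer"]))
    # [('Alice', 'MICRA'), ('Bob', 'VivoMind')]

The essential point is that each database is mapped once to the FO, rather
than pairwise to every other database with which it must interoperate.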

Pat

Patrick Cassidy
MICRA, Inc.
908-561-3416
cell: 908-565-4053
cassidy@xxxxxxxxx

> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-
> bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F. Sowa
> Sent: Wednesday, June 24, 2009 12:31 PM
> To: [ontolog-forum]
> Subject: Re: [ontolog-forum] Fundamental questions about ontology use
> and reuse
> 
> Pat,
> 
> I'm responding to this note under the "Fundamental questions" thread,
> because it's more closely related to that thread.
> 
> PC> One of the possible components of the Foundation Ontology project
>  > that I think should be included is an API that serves to enable
>  > global search over any set of relational databases that have
> elements
>  > mapped to the FO.
> 
> There are four kinds of global search, each of which has very different
> requirements:
> 
>   1. A very precise search that supports detailed reasoning.  Some
>      systems do that by aligning multiple databases in a federation
>      that enables SQL queries to be executed across all the DBs in
>      the federation.  Wolfram Alpha also does that with their curated
>      and carefully structured databases, which are designed to work
>      together in a similar federation.  That federation presupposes
>      a common ontology (developed by Wolfram) with axioms that support
>      the necessary operations.
> 
>   2. The search performed by Google, Yahoo, etc. does not support
>      the reasoning that can be done with multiple federated databases.
>      What they do is information retrieval, and any system (human or
>      computer) that uses the retrieved information must examine the
>      retrieved files to extract whatever information it needs.
> 
>   3. Search and computation against data that had already been
>      aligned according to some standard *terminology*.  Many web
>      sites have data (and forms for entering data) that are tagged
>      with terms such as 'first name', 'last name', 'address', etc.
>      Those terminologies have very few axioms, and they make very
>      few assumptions about the details of whatever those terms
>      refer to.  But for a great many applications, such as selling
>      books and mailing packages, people use those terms with
>      sufficient consistency that the applications work successfully.
> 
>   4. Search and inference across multiple texts written in natural
>      language and multiple databases that have not been aligned.
>      This is the kind of task that VivoMind specializes in, and
>      I'll say more about it below.
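
   To keep the third kind concrete: two independently written programs
can exchange records tagged with the agreed terms even though they share
almost no axioms about what the terms denote.  A minimal sketch in Python,
with invented record contents and an invented consumer function:

    # A record tagged with the shared terminology; no shared axioms,
    # just consistent use of the agreed tag names.
    order = {"first name": "Jane", "last name": "Doe",
             "address": "12 Main St"}

    # An independently developed mailing application that relies only
    # on the tags, not on any theory of names or addresses.
    def mailing_label(record):
        return "%s %s\n%s" % (record["first name"],
                              record["last name"],
                              record["address"])

    print(mailing_label(order))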
> 
> PC> This is a capability advertised by Cyc and Ontology Works (and
>  > a few others with which I am not familiar), but no one can
>  > actually try it out on any scale without spending a few hundred+
>  > kilobucks to hire the companies.
> 
> An ontology won't reduce the cost in the slightest -- *except for*
> those databases that had previously been designed or federated
> according to that ontology.  If you want to federate the databases
> before doing the inferences, you would still need the services of
> a company such as OntologyWorks, or you would have to do the
> equivalent work by yourself (and/or by your employees).
> 
> PC> Having a working ontology-based RDB integration system would,
>  > I expect, make a lot more people interested in the potential for
>  > an ontology.
> 
> That is true *only* for those databases that had previously been
> integrated, either designed from the ground up on a common ontology
> or previously federated by a company such as OntologyWorks.
> 
> If they hadn't been previously federated, you're back to the step
> of spending "a few hundred+ kilobucks" to hire OntologyWorks or
> to do the equivalent work with your own employees.
> 
> PC> I envision it also includes development of an effective
>  > Natural Language interface...
> 
> Fortunately, you don't need to develop such an interface, because
> you can point to the work by VivoMind, which already does that,
> but without using anything that resembles the kind of foundational
> ontology you are proposing.
> 
> PC> You have said on numerous occasions, and I agree, that it is
>  > important to take legacy systems into consideration to encourage
>  > adoption of a new technology.  This is one way to do it and
>  > still provide a basis for scale up to the more demanding
>  > applications that could take full advantage of the logical
>  > inferencing potential of an ontology.
> 
> This is the kind of work that we do today at VivoMind.  Before
> reading the rest of this email note, I suggest that you look
> at the results from some actually implemented systems:
> 
>     http://www.jfsowa.com/talks/pursue.pdf
> 
> All the slides are relevant, but the three applications I'll
> discuss begin on slide #14.
> 
>    *** Pause while you open that file and turn to slide #14 ***
> 
> Slide #14 summarizes the three applications.  The first two use
> older versions of VivoMind software, and the third uses some of
> our latest software (which is capable of supporting the first
> two on a much larger scale than the old software).
> 
> The first application, Educational Software, is described on
> slides #15 to #19.  Three different companies tried to do it.
> 
> The first company did something along the lines you are proposing:
> use a large ontology and deductive methods of reasoning.  That
> approach failed for reasons stated on slide #18.  The second one,
> which used a statistical method, also failed.  See slide #19.
> 
> Slides #20 to #23 describe the VivoMind approach, which worked.
> For interpreting natural language, we did *not* use a large
> general-purpose upper ontology.  Instead, we used lexical
> resources along the lines of WordNet, Roget's Thesaurus,
> and VerbNet.  As you know, those resources contain very few
> axioms or assumptions other than type-subtype and part-whole.
> 
> Slides #20 and #21 show the kinds of definitions used for
> a domain-dependent ontology about arithmetic.  Only a dozen
> or so conceptual graphs of the kind illustrated there were
> sufficient as a supplement to the lexical resources.
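
   For concreteness, the two relation types mentioned above (type-subtype
and part-whole) need very little machinery.  A minimal sketch in Python,
with invented arithmetic-flavored entries standing in for WordNet-style
data (this is not VivoMind code):

    # Type-subtype (hypernym) and part-whole (meronym) links: the only
    # relational structure the lexical resources are assumed to carry.
    HYPERNYM = {"integer": "number", "fraction": "number",
                "number": "quantity"}
    PART_OF = {"numerator": "fraction", "denominator": "fraction"}

    def is_subtype(word, ancestor):
        # Walk the hypernym chain upward.
        while word in HYPERNYM:
            word = HYPERNYM[word]
            if word == ancestor:
                return True
        return False

    print(is_subtype("integer", "quantity"))   # True
    print(PART_OF["numerator"])                # fraction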
> 
> Slides #22 and #23 discuss how the VivoMind software used
> case-based reasoning to solve the problem.
> 
> Slides #24 to #27 discuss the legacy re-engineering problem,
> which used a small domain-dependent ontology for analyzing
> COBOL programs together with lexical resources along the
> lines mentioned above.
> 
> Slides #24 and #25 describe the problem and the VivoMind approach.
> Slide #26 shows a typical paragraph from the English documentation.
> Note the following points:
> 
>   1. The English consists of some ordinary English words that are
>      found in the lexical resources plus a lot of computer jargon
>      and named entities that are found only in this domain.
> 
>   2. Interpreting such English without a detailed ontology would be
>      impossible.  However, the first step (discussed in slide #25)
>      used an off-the-shelf grammar for COBOL and a domain ontology
>      to translate the COBOL to conceptual graphs.
> 
>   3. The domain ontology (written by Arun Majumdar) assumed one
>      concept type for each COBOL syntactic type.  Arun defined
>      additional concept and relation types to group the COBOL
>      types in more general supertypes and some conceptual graphs
>      to relate the COBOL types to English words (either from
>      WordNet or from the jargon used in the domain).
> 
>   4. Arun translated the COBOL grammar to Prolog rules, which
>      invoked the same VivoMind rules that generated CGs from
>      English.  While parsing the COBOL, the parser made a list
>      of all named entities (program names, file names, variable
>      names, and named data items) and linked them to all graphs
>      in which they were mentioned.
> 
>   5. Then the Intellitex parser used the conceptual graphs and
>      named entities derived from COBOL to interpret the English,
>      such as the example in slide #26.
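
   The index of named entities described in point #4 is easy to picture.
A minimal sketch in Python, with an invented COBOL fragment and with line
numbers standing in for the conceptual graphs that mention each name:

    import re
    from collections import defaultdict

    cobol = [
        "01 CUST-REC.",
        "   05 CUST-NAME PIC X(30).",
        "MOVE CUST-NAME TO PRINT-LINE.",
        "PERFORM WRITE-REPORT.",
    ]

    # Index every named entity (here, any hyphenated COBOL identifier)
    # by the statements -- standing in for graphs -- that mention it.
    index = defaultdict(list)
    for n, line in enumerate(cobol, 1):
        for name in re.findall(r"[A-Z][A-Z0-9]*(?:-[A-Z0-9]+)+", line):
            index[name].append(n)

    print(dict(index))
    # {'CUST-REC': [1], 'CUST-NAME': [2, 3], 'PRINT-LINE': [3],
    #  'WRITE-REPORT': [4]}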
> 
> Slide #27 shows the final results, which were exactly what the
> customer wanted.  A major consulting firm estimated that the
> task would require 80 person years to do by hand.  With the
> VivoMind software, it took 15 person weeks plus 3 computer weeks.
> 
> Slides #28 to #29 describe the differences between the old
> VivoMind software and the new VivoMind Language Processor (VLP),
> which we are actively developing and extending.
> 
> Slides #30 to #38 describe the application to Oil and Gas
> Exploration.  This application does the fourth kind of search
> described above:  Search and inference across multiple texts
> written in natural language and multiple databases that have
> not been previously aligned.
> 
> Like the other examples, it uses lexical resources, not a
> true ontology.  We have upgraded the resources in the past
> few years, but most of the resources were downloaded for
> free from the WWW.
> 
> For some of the resources we did some integration and
> alignment.  But for others (such as Roget's Thesaurus) we
> did not attempt to do any integration.  Instead, the agents
> dynamically do whatever alignment seems appropriate during
> the parsing.  For more information about that, see
> 
>     http://www.jfsowa.com/pubs/paradigm.pdf
> 
> The domain ontology was written by EGI (Earth and Geoscience
> Institute) with some tutoring and consulting by Arun and me.
> As a result of this work, we have developed some semi-automated
> development aids that enable a domain expert with no knowledge
> of any special knowledge representation language to write the
> domain ontology:
> 
>   1. Analysis and extraction tools that find all the words in
>      the source documents that are not already in the lexical
>      resources or in any list of named entities.
> 
>   2. A tentative ontology that forms hypotheses about how the
>      unknown terms are related to known terms and to one
>      another.
> 
>   3. The domain expert can edit the tentative ontology to
>      correct any errors and to add any additional concept
>      or relation types.
> 
>   4. Steps #1, #2, and #3 can be iterated as many times as
>      needed to improve the ontology.
> 
>   5. The domain expert(s) can use controlled English to
>      write more detailed axioms needed for inferencing.
> 
>   6. The VivoMind software checks the axioms from #5 for
>      consistency with the tentative ontology and with the
>      other resources used for interpreting the English.
> 
>   7. Steps #1 to #6 can be reiterated with additional
>      source documents until the domain experts are
>      satisfied that the VLP system is interpreting the
>      documents correctly.
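
   Step #1 of that loop can be pictured as a simple set difference.  A
minimal sketch in Python, with an invented word list and document (this
is not the VivoMind tooling):

    import re

    # Stand-ins for the lexical resources and named-entity lists.
    known = {"the", "pressure", "was", "logged", "at", "depth"}

    document = "The Frio sandstone pressure was logged at depth TD-9140."

    # Every word the resources do not cover is a candidate term for
    # the tentative domain ontology.
    words = re.findall(r"[A-Za-z][\w-]*", document)
    unknown = {w.lower() for w in words} - known

    print(sorted(unknown))   # ['frio', 'sandstone', 'td-9140']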
> 
> We are still working on these tools to reduce the human effort
> as much as possible.  Our goal is to enable the domain experts
> to generate their own ontologies with a minimal amount of
> tutorials and consulting from VivoMind.
> 
> This approach is working very well.  It's possible that more
> general upper-level ontologies could be useful.  If so, the VLP
> system could use them.  But we don't require any such ontology
> to implement applications along the lines of the examples
> presented in those slides.
> 
> John
> 


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
