Re: [ontolog-forum] Semantic Systems

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sat, 27 Jun 2009 13:34:03 -0400
Message-id: <4A46580B.5060100@xxxxxxxxxxx>
Rich,    (01)

JFS>> ... there are semi-automated tools that extract info
 >> from unrestricted English as an aid to writing ontologies
 >> and formal specifications.    (02)

RC> ... do you have any references on those semi-automated tools?    (03)

People have been developing such tools for a long time.  An older
but still very useful approach is NIAM (Natural Language Information
Analysis Method), which has merged with ORM (Object Role Modeling).
The NIAM methodology by Nijssen started with simple sentences that
described the kinds of entities and their relationships in English.
Following is a survey (written in 1998) of the ORM/NIAM work:    (04)

    http://www.orm.net/pdf/springer.pdf    (05)

This doesn't use fully unrestricted English, but it uses simple
sentences selected or written by somebody familiar with the
subject matter.    (06)

In my 2002 paper on Architectures for Intelligent Systems, I summarized
approaches based on controlled natural languages in Section 3.  See
that section and the references:    (07)

    http://www.jfsowa.com/pubs/arch.htm    (08)

Following is a paper from 2004, which analyzes documents to find
relationships:    (09)

    Towards Terascale Knowledge Acquisition    (010)

A more recent article by the same authors:    (011)

http://www.isi.edu/natural-language/people/hovy/papers/06dgo-SiFT-Guspin.pdf    (012)

Eduard Hovy, the third author of those papers, published a broader
survey of the field in 2005:    (013)

    Methodologies for the Reliable Construction of Ontological Knowledge    (014)

For a full list of work by Hovy and his group, see
    http://www.isi.edu/~hovy/    (015)

You can find a lot more by searching Google for "knowledge
acquisition" or "knowledge capture".    (016)

For a summary of the more recent work we're doing at VivoMind,
I'm copying the following summary from a previous note.  The
three dots (...) indicate parts that were deleted.    (017)

John    (018)

-------- Original Message --------
Subject: Re: [ontolog-forum] Fundamental questions about ontology
use and reuse
Date: Wed, 24 Jun 2009 12:30:44 -0400
From: John F. Sowa <sowa@xxxxxxxxxxx>
To: [ontolog-forum] <ontolog-forum@xxxxxxxxxxxxxxxx>    (019)

...    (020)

PC> You have said on numerous occasions, and I agree, that it is
 > important to take legacy systems into consideration to encourage
 > adoption of a new technology.   This is one way to do it and
 > still provide a basis for scale up to the more demanding
 > applications that could take full advantage of the logical
 > inferencing potential of an ontology.    (021)

This is the kind of work that we do today at VivoMind.  Before
reading the rest of this email note, I suggest that you look
at the results from some actually implemented systems:    (022)

    http://www.jfsowa.com/talks/pursue.pdf    (023)

...    (024)

Slides #24 to #27 discuss the legacy re-engineering problem.
The solution used a small domain-dependent ontology for analyzing
COBOL programs together with lexical resources along the
lines mentioned above.    (025)

Slides #24 and #25 describe the problem and the VivoMind approach.
Slide #26 shows a typical paragraph from the English documentation.
Note the following points:    (026)

  1. The English consists of some ordinary English words that are
     found in the lexical resources plus a lot of computer jargon
     and named entities that are found only in this domain.    (027)

  2. Interpreting such English without a detailed ontology would be
     impossible.  However, the first step (discussed in slide #25)
     used an off-the-shelf grammar for COBOL and a domain ontology
     to translate the COBOL to conceptual graphs.    (028)

  3. The domain ontology (written by Arun Majumdar) assumed one
     concept type for each COBOL syntactic type.  Arun defined
     additional concept and relation types to group the COBOL
     types in more general supertypes and some conceptual graphs
     to relate the COBOL types to English words (either from
     WordNet or from the jargon used in the domain).    (029)

  4. Arun translated the COBOL grammar to Prolog rules, which
     invoked the same VivoMind rules that generated CGs from
     English.  While parsing the COBOL, the parser made a list
     of all named entities (program names, file names, variable
     names, and named data items) and linked them to all graphs
     in which they were mentioned.    (030)

  5. Then the Intellitex parser used the conceptual graphs and
     named entities derived from COBOL to interpret the English,
     such as the example in slide #26.    (031)
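
As a rough illustration of the bookkeeping in points 4 and 5, here is
a minimal Python sketch.  It is not the VivoMind code (which used
Prolog rules and conceptual graphs).  The names build_entity_index and
graphs_for_text are hypothetical, and each graph is reduced to an id
plus the set of named entities found in it.  The point is only that an
index from entity names to graphs lets the English parser look up the
relevant graphs when a name appears in the documentation.

```python
import re
from collections import defaultdict


def build_entity_index(graphs):
    """Map each named entity to the ids of the graphs that mention it.

    `graphs` is a hypothetical stand-in for the parser output: a dict
    from graph id to the set of named entities (program, file,
    variable, and data-item names) found while parsing the COBOL.
    """
    index = defaultdict(set)
    for graph_id, entities in graphs.items():
        for name in entities:
            # Names are compared case-insensitively (an assumption).
            index[name.upper()].add(graph_id)
    return dict(index)


def graphs_for_text(text, index):
    """Return ids of graphs linked to any named entity in `text`."""
    hits = set()
    # COBOL-style names may contain hyphens, so keep them in tokens.
    for token in re.findall(r"[A-Za-z0-9-]+", text):
        hits |= index.get(token.upper(), set())
    return hits
```

A sentence from the documentation that mentions a file name would
then retrieve every graph in which that file occurs, and those graphs
supply the background for interpreting the sentence.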

...    (032)

The domain ontology was written by EGI (Earth and Geoscience
Institute) with some tutoring and consulting by Arun and me.
As a result of this work, we have developed some semi-automated
development aids that enable a domain expert with no knowledge
of any special knowledge representation language to write the
domain ontology:    (033)

  1. Analysis and extraction tools that find all the words in
     the source documents that are not already in the lexical
     resources or in any list of named entities.    (034)

  2. A tentative ontology that forms hypotheses about how the
     unknown terms are related to known terms and to one
     another.    (035)

  3. The domain expert can edit the tentative ontology to
     correct any errors and to add any additional concept
     or relation types.    (036)

  4. Steps #1, #2, and #3 can be iterated as many times as
     needed to improve the ontology.    (037)

  5. The domain expert(s) can use controlled English to
     write more detailed axioms needed for inferencing.    (038)

  6. The VivoMind software checks the axioms from #5 for
     consistency with the tentative ontology and with the
     other resources used for interpreting the English.    (039)

  7. Steps #1 to #6 can be reiterated with additional
     source documents until the domain experts are
     satisfied that the VLP system is interpreting the
     documents correctly.    (040)
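
Step #1 above can be pictured with a small Python sketch.  This is
only an illustration under simplifying assumptions, not the VivoMind
tools: the function name unknown_terms is hypothetical, the lexical
resources and named-entity lists are plain sets of strings, and words
are found with a simple regular expression rather than a real parser.

```python
import re


def unknown_terms(documents, lexicon, named_entities):
    """List words in the documents that no known resource covers.

    `lexicon` and `named_entities` are hypothetical stand-ins for the
    lexical resources and named-entity lists; matching ignores case.
    Results are ordered by descending frequency, then alphabetically.
    """
    known = {w.lower() for w in lexicon}
    known |= {n.lower() for n in named_entities}
    counts = {}
    for doc in documents:
        for word in re.findall(r"[A-Za-z][A-Za-z0-9-]*", doc):
            key = word.lower()
            if key not in known:
                counts[key] = counts.get(key, 0) + 1
    return sorted(counts, key=lambda w: (-counts[w], w))
```

The resulting list is the raw material for step #2.  Each pass
through steps #1 to #3 should shrink it, since terms the expert
accepts move into the known resources.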

We are still working on these tools to reduce the human effort
as much as possible.  Our goal is to enable the domain experts
to generate their own ontologies with minimal tutoring and
consulting from VivoMind.    (041)

This approach is working very well.  It's possible that more
general upper-level ontologies could be useful.  If so, the VLP
system could use them.  But we don't require any such ontology
to implement applications along the lines of the examples
presented in those slides.    (042)
