
Re: [ontolog-forum] Solving the information federation problem

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Cc: Arun Majumdar <arun@xxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Thu, 03 Nov 2011 02:07:16 -0400
Message-id: <4EB22F94.8040801@xxxxxxxxxxx>
Leo,    (01)

Arun would be happy to go through the details with you at any time.    (02)

> I simply find this hard to believe, unless you already had very
> elaborate software.    (03)

It most definitely is elaborate.  Section 6 (slides 81-102) covers the
VivoMind Analogy Engine (VAE):  http://www.jfsowa.com/talks/goal.pdf
You'd be hard pressed to find anybody else with such software.    (04)

> Just understanding (analyzing) the domain as a human would take you
> 2 weeks, in my estimate.  Not spending the time to understand it,
> but just applying your tools, means you will do it wrong.    (05)

By humans, it would take much, much longer.  Note slide 104, which
says that there were 1.5 million lines of COBOL plus 100 megabytes
of English documentation.  Note that the consulting firm estimated
40 people for 2 years (80 person years) to analyze all that and
to generate a cross-reference of the programs to the documentation.    (06)

What Arun did was to take an off-the-shelf grammar for COBOL and
modify the back end to generate conceptual graphs that represent
the COBOL data definitions, file definitions, and code.    (07)
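
As a rough illustration of the idea of turning COBOL data definitions into graph structure, here is a minimal Python sketch. It is not Arun's grammar or back end: the regular expression, the function name data_items_to_triples, and the relation labels "has-part" and "picture" are all assumptions made for this example, and it handles only simple level-numbered entries with an optional PIC clause.

```python
import re

# Hypothetical pattern: level number, data name, optional PIC clause, period.
ENTRY = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+?))?\.\s*$")

def data_items_to_triples(source):
    """Turn simple COBOL data entries into (subject, relation, object) triples."""
    triples = []
    stack = []  # (level, name) of the enclosing group items
    for line in source.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue  # skip anything this toy pattern does not cover
        level, name, pic = int(m.group(1)), m.group(2), m.group(3)
        # Pop items at the same or a deeper level: they cannot contain this one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            triples.append((stack[-1][1], "has-part", name))
        if pic:
            triples.append((name, "picture", pic))
        stack.append((level, name))
    return triples

SAMPLE = """
01 CUSTOMER-RECORD.
   05 CUST-ID PIC 9(6).
   05 CUST-NAME PIC X(30).
"""

if __name__ == "__main__":
    for triple in data_items_to_triples(SAMPLE):
        print(triple)
```

Triples are used here only because they are easy to print and compare; conceptual graphs are considerably richer than this.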

Note slide 105:    (08)

> An extremely difficult and still unsolved problem:
> ● Translate English specifications to executable programs.
> Much easier task:
> ● Translate the COBOL programs to conceptual graphs.
> ● Use the conceptual graphs from COBOL to interpret the English.
> ● Use VAE to compare the graphs derived from COBOL to the
>   graphs derived from English.
> ● Record the similarities and discrepancies.
> The graphs derived from COBOL provide a formal semantics
> for the informal English.    (09)

As VLP (VivoMind Language Processor) parses the English, it uses the
VivoMind Analogy Engine (VAE) to find conceptual graphs from COBOL
that match something in the sentences.  Any sentence that doesn't
match anything from COBOL is discarded as irrelevant.  But if there
is a match, VAE uses the graphs to resolve ambiguities and to fill
in any missing, but implicit details.    (010)
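
The keep-or-discard step described above can be sketched in a few lines of Python. This is only a stand-in: VAE performs analogical matching over conceptual graphs, whereas the hypothetical filter_sentences function below uses plain term overlap between a sentence and the names that appear in COBOL-derived triples. The triple format and all names are assumptions made for this example.

```python
import re

def normalize(term):
    """Lowercase and replace punctuation (including COBOL hyphens) with spaces."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()

def filter_sentences(sentences, cobol_triples):
    """Split sentences into those that mention a COBOL concept and the rest."""
    concepts = set()
    for subject, _relation, obj in cobol_triples:
        concepts.add(normalize(subject))
        concepts.add(normalize(obj))
    kept, discarded = [], []
    for sentence in sentences:
        text = " " + normalize(sentence) + " "
        # Whole-phrase match, so "cust id" does not fire inside another word.
        hits = sorted(c for c in concepts if c and f" {c} " in text)
        if hits:
            kept.append((sentence, hits))
        else:
            discarded.append(sentence)  # treated as irrelevant commentary
    return kept, discarded

if __name__ == "__main__":
    triples = [("CUSTOMER-RECORD", "has-part", "CUST-ID")]
    docs = ["Each customer record has a unique key.",
            "The cafeteria opens at noon."]
    kept, discarded = filter_sentences(docs, triples)
    print(kept)
    print(discarded)
```

A real matcher would also use the matched graphs to resolve ambiguities and fill in implicit details, which this overlap test makes no attempt to do.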

The COBOL graphs are assumed to be accurate, and the English is
assumed to be an informal approximation to the formal COBOL.
But the English may contain additional commentary, which is
irrelevant for the purpose of generating cross references and
looking for discrepancies.  For some examples of the kind of
discrepancies that VAE found, see slides 109 to 111.    (011)

> I'd like to see the metrics resulting from this project, if possible.    (012)

I don't know what you mean by metrics.  Arun can show you the CD-ROM
that contained the results of the analysis.    (013)

Slide 108 shows what the client wanted the consulting firm to produce:    (014)

> Glossary, data dictionary, data flow diagrams, process architecture,
> system context diagrams.    (015)

Some of that output was generated by off-the-shelf tools that could
analyze COBOL code to generate such information.    (016)

But there were no tools that could do a cross-reference of the
documentation and the programs and check for discrepancies.
That was what VAE did.    (017)

VLP with some heuristics was used to produce a glossary for human
readers.  For example, if a certain phrase X was found in a pattern
such as "An X is a ...", that was considered a candidate for a
definition of X.  It was added to the glossary with a pointer
to the source.  Arun and André proofread the glossary to toss
out any sentences that weren't useful definitions.    (018)
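
The "An X is a ..." heuristic can be approximated with a regular expression. The Python sketch below is an assumption-laden illustration, not the VLP implementation: the pattern, the function name glossary_candidates, and the use of a sentence index as the "pointer to the source" are all hypothetical.

```python
import re

# Hypothetical pattern for sentences of the form "An X is a ...".
DEFINITION = re.compile(
    r"^(?:An?|The)\s+(.+?)\s+is\s+(?:an?|the)\s+(.+?)\.?$",
    re.IGNORECASE,
)

def glossary_candidates(sentences):
    """Return candidate glossary entries with a pointer back to the source."""
    candidates = []
    for index, sentence in enumerate(sentences):
        text = sentence.strip()
        m = DEFINITION.match(text)
        if m:
            candidates.append({
                "term": m.group(1),
                "definition": text,
                "source": index,  # stand-in for a document/line reference
            })
    return candidates

if __name__ == "__main__":
    docs = ["A policy holder is a person who owns a policy.",
            "Run the batch job nightly."]
    for entry in glossary_candidates(docs):
        print(entry)
```

A pattern this loose will also pick up sentences that are not useful definitions, which is why a human proofreading pass over the candidates is still needed.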

The client said that the CD-ROM contained exactly what they wanted
the consulting firm to do.    (019)

That's a pretty good metric.  And it's a metric that satisfied
a living, breathing customer that was delighted to pay for
15 person weeks of work instead of 80 person years.  That's
a lot more convincing than the kinds of numbers typically
generated for toy problems.    (020)

John    (021)

