Re: [ontolog-forum] Solving the information federation problem

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Cc: "simf-rfp@xxxxxxx" <simf-rfp@xxxxxxx>, "William M. Ulrich" <wmmulrich@xxxxxxxxxxx>, "jamsden@xxxxxxxxxx" <jamsden@xxxxxxxxxx>, Paul Brown <pbrown@xxxxxxxxx>, Nikolai Mansurov <nick@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Thu, 03 Nov 2011 10:51:47 -0400
Message-id: <4EB2AA83.9080208@xxxxxxxx>
I agree with Leo that really doing this is a much harder problem than
just analyzing the code and data structures.  Relying on that analysis
alone assumes, for example, that the data structure and code
nomenclature is consistent across multiple programs (which may be the
case if stringent coding rules were enforced for initial development and
all subsequent modification), and that the nomenclature is accurate to
each usage, as distinct from 'tailored to each usage'.

Further, code analysis software is well advanced, to the point that
there are many commercial products and several standards.  The usual
problem for the vendor of software analysis tools of the 1990s was:
convert programs in language X to programs in language Y.  That means
you have to have analytical capability for X, e.g., COBOL, Visual Basic,
FORTRAN, Ada, Pascal, and some level of 'semantic conversion' to Y:
C/C++, C#, Java, Ruby and next year's first-round draft choice.  And
only sometimes do you have exactly the required support for both.  So
the quickest way to get the customer is to team with a competitor,
preferably using a standard for the output of the analysis tool that is
input to the generation tool.

OMG, for example, has had a working group in this area for 10 years.
OSF had such a group in 1990, and I assume there are others.  The OMG
work has produced two supporting standards.  The Abstract Syntax Tree
Metamodel is a standard representation of the parse -- the program
content at the programming language level, using standard terms for
concepts that are common across languages, and specialized terms for the
concepts that aren't.  The Knowledge Discovery Metamodel (KDM) is a
higher-level 'semantic model' of the program information -- what the
information units are and are called, and how the program uses them, in
detail and in summary (akin to database transactional analysis -- read,
modify, create, delete).  "What the information units are" is a
combination of technical information and the formal in-program and
perhaps external documentation.  (The KDM is not 'semantics' in the
sense of interpreted natural language.)
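
To make the "in summary" view concrete, here is a minimal Python sketch
of that kind of usage table.  It is not the KDM schema; the Usage record,
the Action names, and the program and record names (BILLING, ARCHIVE,
CUSTOMER-REC, INVOICE-REC) are all hypothetical, chosen only to show
information units cross-referenced against the programs that create,
read, modify, or delete them.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four transactional actions named in the text."""
    CREATE = "create"
    READ = "read"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass(frozen=True)
class Usage:
    """One observed use of an information unit (hypothetical record shape)."""
    program: str   # program or module in which the use occurs
    unit: str      # information unit, e.g. a record or field name
    action: Action


def summarize(usages):
    """Return {unit: {program: sorted list of action names}}."""
    summary = defaultdict(lambda: defaultdict(set))
    for u in usages:
        summary[u.unit][u.program].add(u.action.value)
    return {
        unit: {prog: sorted(actions) for prog, actions in progs.items()}
        for unit, progs in summary.items()
    }


# Hypothetical detail-level facts, as an analysis tool might emit them.
usages = [
    Usage("BILLING", "CUSTOMER-REC", Action.READ),
    Usage("BILLING", "INVOICE-REC", Action.CREATE),
    Usage("ARCHIVE", "INVOICE-REC", Action.READ),
    Usage("ARCHIVE", "INVOICE-REC", Action.DELETE),
]

summary = summarize(usages)
print(summary["INVOICE-REC"])
```

The individual Usage records stand in for the detail level; the
dictionary returned by summarize() stands in for the summary level.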

Finally, I would observe that the current thrust in code analysis is 
program verification.  There is commercial software that returns us to 
the glorious days of yesteryear when 'formal proof of programs' was a 
common academic pursuit.  The objective these days is to verify 
consistency between some formalized description of user intent for 
domain information/behavior and the technical behaviors of the 
software.  At OMG, this is the domain of the relatively new Systems 
Assurance Task Force, which has competitors in other standards 
organizations (in part because of the SOX legislation, and in part 
because there is government money available on both sides of the 
Atlantic).  At this point, the standards work is restricted to the 
established capabilities of techniques developed by earlier research.

So what John describes is in fact readily supported by commercially
available software products.  Two OMG colleagues with a handle on the
state of the art leap to mind:  Bill Ulrich (TSG) and Nick Mansurov (KDM
Analytics), as does an organization called softwaremodernization.com.

-Ed

Obrst, Leo J. wrote:
> John, 
>
> I simply find this hard to believe, unless you already had very elaborate
> software. Just understanding (analyzing) the domain as a human would take you
> 2 weeks, in my estimate. Not spending the time to understand it, but just
> applying your tools, means you will do it wrong. I'd like to see the metrics
> resulting from this project, if possible.
>
> I don't believe in magic, but I do believe that smart programmers can
> seemingly do magic. Knowing Arun, he's capable of mini-magic, but to this
> extent? I don't know.
>
> Thanks,
> Leo
>
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F. Sowa
> Sent: Wednesday, November 02, 2011 12:42 AM
> To: Paul Brown
> Cc: jamsden@xxxxxxxxxx; [ontolog-forum]; simf-rfp@xxxxxxx
> Subject: Re: [ontolog-forum] Solving the information federation problem
>
> On 11/1/2011 8:04 AM, Paul Brown wrote:
>   
>> For this ontology (language) to be useful, both parties need to relate these
>> terms to their respective ontologies for running their businesses. Today few
>> (if any) parties explicitly craft these ontologies - they are implicit. Yet
>> there is much implied in the data structures that they exchange as there is
>> in the data structures that are used within each party's IT systems. I see
>> some benefit in analyzing these data structures and extracting the
>> ontological fragments they represent.
>>     
>
> I certainly agree that much of the ontology is implicit in the data 
> structures.  But I would add that it's possible to relate those
> data structures to other languages.
>
> In particular, I'd like to mention a successful application to
> legacy re-engineering.  It involved analyzing COBOL programs that
> were up to 40 years old and had gone through many modifications
> and updates over the years.  The company also had documentation
> of various kinds -- reports, manuals, notes, emails, etc.
>
> The problem was to analyze both the programs and the documentation,
> relate them to each other, detect errors and inconsistencies, and
> generate a glossary, data dictionary, etc.
>
> A major consulting firm estimated that the project would require
> 40 people for 2 years to read all the documentation and relate
> it to the software and the databases.  But with suitable computer
> analysis, the project was completed in 15 person weeks.
>
> See slides 104 to 112 of the following slides:
>
>     http://www.jfsowa.com/talks/goal.pdf
>
> (Just type 104 into the page window of the Adobe reader to jump
> to slide 104.)
>
> John
>
>  
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/ 
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>  
>  

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                Cel: +1 240-672-5800

"The opinions expressed above do not reflect consensus of NIST, 
 and have not been reviewed by any Government authority."

