Re: [ontolog-forum] Language vs Logic

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Tue, 14 Sep 2010 11:41:07 -0400
Message-id: <4C8F9793.5030503@xxxxxxxx>

Jim Rhyne wrote:
> This is OK as long as you realize that data integrity and data semantics are
> contained in the applications, that you understand these legacy systems well
> enough to be sure you understand the data semantics and that you can
> reproduce them without error. Legacy databases are often full of codes that
> are meaningless except when interpreted by the applications.

Strongly agree.  Reverse engineering a "legacy" (read: existing/useful) 
database can be an intensely manual process.  Analysis of the 
application code can tell you what a data element is used for and how it 
is used/interpreted.  The database schema itself can only give you a 
name, a key set, and a datatype.  OK, SQL2 allows you to add a lot of 
rules about data element relationships, and presumably the ones that are 
actually written in the schema have some conceptual basis.
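
A minimal sketch of how little the schema by itself gives back, assuming an invented table (ACCT, STAT_CD, the code values and the CHECK rule are all hypothetical) and using SQLite only because it ships with Python:

```python
import sqlite3

# Hypothetical legacy table; every name and code value here is invented.
DDL = """
CREATE TABLE ACCT (
    ACCT_ID   INTEGER PRIMARY KEY,
    CUST_ID   INTEGER NOT NULL,
    STAT_CD   CHAR(2),
    OPEN_DT   CHAR(8),
    CHECK (STAT_CD IN ('A1', 'A7', 'X3'))
)
"""

def describe(conn, table):
    """Return (name, declared type, not-null flag, key position) per column."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    return [(r[1], r[2], bool(r[3]), r[5]) for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
columns = describe(conn, "ACCT")
for col in columns:
    print(col)
```

The catalog reports that STAT_CD is a two-character column and that ACCT_ID is the key.  Nothing in it says what 'A7' means to the application that reads it.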

Reverse engineering a database is the process of converting a data 
structure model back into the concept model that it implements.  And the 
problem is that the "forward engineering" mapping is not one to one from 
modeling _language_ to implementation _language_.  It is many-to-one, 
which means that a simple inversion rule is wrong much of the time, and 
the total effect of the simple rules on an interesting database schema 
is always to produce nonsense.  Application analysis has the advantage 
of context in each element interpretation; database schema analysis is 
exceedingly limited in that regard.
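
To make the many-to-one point concrete, here is a rough sketch.  The rule table is invented for illustration and is not taken from any particular design method; a real method has many more rules.

```python
from collections import defaultdict

# Hypothetical forward-engineering rules:
# (concept-model construct) -> (schema construct it is implemented as).
FORWARD_RULES = [
    ("mandatory many-to-one relationship", "NOT NULL foreign key column"),
    ("existence dependency on a parent entity", "NOT NULL foreign key column"),
    ("optional many-to-one relationship", "nullable foreign key column"),
    ("attribute that does not apply to every instance", "nullable column"),
    ("attribute whose value is not yet known", "nullable column"),
    ("subclass with no local attributes", "code column"),
    ("enumerated attribute", "code column"),
    ("entity", "table with single-column key"),
]

def invert(rules):
    """Group concept constructs by the schema construct they produce."""
    inverse = defaultdict(list)
    for concept, schema in rules:
        inverse[schema].append(concept)
    return dict(inverse)

def ambiguous(inverse):
    """Schema constructs with more than one possible conceptual source."""
    return {s: c for s, c in inverse.items() if len(c) > 1}

INVERSE = invert(FORWARD_RULES)

for schema, concepts in ambiguous(INVERSE).items():
    print(f"{schema}: {len(concepts)} candidate readings")
    for concept in concepts:
        print(f"  - {concept}")
```

Every schema construct that shows up in the ambiguous set is a place where a simple inversion rule has to pick one reading and will be wrong for the others.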

That said, other contextual knowledge can be brought to bear.  If, for 
example, you know that the database design followed some "information 
analysis method" and the database schema was then "generated" (even if 
by hand) according to the principles of that method, then you may be 
able to recognize "entity" tables and "relationship" tables and 
"attributes" and "dependencies" and "code" data types and "date" data 
types, and so on.  And if the schema rules are also written according to 
the methodology, they can help.  But a different analysis/design method 
may beget very similar structures with a different set of conventions 
for naming and relationship representation.  For example, is a foreign 
key attribute in an entity table an existential dependency or just a 
"functional" (1..1 or 0..1) relationship?  And is a subclass represented 
by a separate table, or by a code attribute (type name) if it has no 
local attributes (as distinct from local relationships)?  And there is 
always the question of what "null permitted" means.  (Remember Ted 
Codd's "kinds of nothing"?)
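
The questions above can be sketched as a method-aware classifier that also produces a checklist for the human engineers.  This is only a sketch under an assumed convention (a table whose key is made up entirely of two or more foreign keys is a "relationship" table); the Column and Table types and the EMP/EMP_PROJ tables are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Column:
    name: str
    in_key: bool = False
    references: Optional[str] = None  # referenced table, if a foreign key
    nullable: bool = False

@dataclass
class Table:
    name: str
    columns: List[Column]

def classify(table: Table) -> str:
    """Guess the table's role under the ASSUMED convention described above."""
    key = [c for c in table.columns if c.in_key]
    if len(key) >= 2 and all(c.references for c in key):
        return "relationship"
    return "entity"

def review_items(table: Table) -> List[str]:
    """List the questions the schema cannot answer by itself."""
    items = []
    for c in table.columns:
        where = f"{table.name}.{c.name}"
        if c.references and not c.in_key:
            card = "0..1" if c.nullable else "1..1"
            items.append(f"{where}: {card} reference to {c.references}; "
                         "existence dependency or just functional?")
        elif c.nullable:
            items.append(f"{where}: what does NULL mean here?")
    return items

EMP = Table("EMP", [
    Column("EMP_ID", in_key=True),
    Column("DEPT_ID", references="DEPT"),
    Column("MGR_ID", references="EMP", nullable=True),
    Column("TERM_DT", nullable=True),
])
EMP_PROJ = Table("EMP_PROJ", [
    Column("EMP_ID", in_key=True, references="EMP"),
    Column("PROJ_ID", in_key=True, references="PROJ"),
    Column("ROLE_CD"),
])

for t in (EMP, EMP_PROJ):
    print(t.name, "->", classify(t))
    for item in review_items(t):
        print("   ", item)
```

The classifier only guesses; the checklist is the part that matters, because each item is a question that someone who knows the application domain has to answer.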

So, if you know the design method and believe it was used consistently 
and faithfully, you can code a reverse mapping that is complex but 
fairly reliable.  Even so, you still need human engineers looking over 
every detail and repairing the weird things.  Further, the human 
engineers must be familiar with the application domain, and have access 
to the business experts and some of the software engineers.  All of this 
translates to a full-blown software engineering project with some 
assistance from software analysis tools.  I'm not sure how much easier 
that is than using software analysis tools on the applications, and in 
this day and age, there is no reason not to use both sets of tools, 
especially if the providers have tool sets that work together.

-Ed

P.S.  OMG has a whole gang of software analysis tool vendors making 
standards for interchange of the analytical results, because none of 
them alone ever has the right set of capabilities for any major 
customer.  They are the Software Assurance Group and the 
"Architecture-Driven Modernization" Task Force.  (The latter is ADM, a 
play on the OMG "MDA" engineering approach, because they do _reverse_ 
engineering.)

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694

"The opinions expressed above do not reflect consensus of NIST, 
 and have not been reviewed by any Government authority."


>
> Jim Rhyne
> Software Renovation Consulting
> Los Gatos, California
> http://www.enterprisesoftwarerenovation.com/
>
>
>
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Rich Cooper
> Sent: Sunday, September 12, 2010 2:04 PM
> To: '[ontolog-forum] '
> Subject: Re: [ontolog-forum] Language vs Logic
>
> Hi David,
>
> You are right-on with a realistic view of how this will progress, IMHO.
>
> But instead of reverse engineering legacy systems, consider projects to
> reverse engineer legacy DATABASEs.  That is a whole lot more effective and
> way less expensive.  Also, it happens to be one subject discussed in my
> patent at http://www.englishlogickernel.com/Patent-7-209-923-B1.PDF, which I
> mentioned earlier.
>
> By reverse engineering the database, you can still use whatever remains
> useful of the old data model, the AsIs version.  It helps define what the
> users were actually typing into those fields, just in case the new design
> team wants to know how the users viewed each field, and some of the timing
> and volume measurements can be helpful in estimating performance for the new
> database, the ToBe version.
>
> Still more information can be reconstructed by analysis of the domains
> actually represented in the data, where often surprising correlations are
> found.  The old Information Flow Framework showed some insightful ways to
> look at actual domain sample vectors, or at least this interpretER saw it
> that way.
>
> -Rich
>
> Sincerely,
> Rich Cooper
> EnglishLogicKernel.com
> Rich AT EnglishLogicKernel DOT com
> 9 4 9 \ 5 2 5 - 5 7 1 2
>
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of David Eddy
> Sent: Sunday, September 12, 2010 1:41 PM
> To: [ontolog-forum]
> Subject: Re: [ontolog-forum] Language vs Logic
>
> Pat -
>
> On Sep 12, 2010, at 12:19 AM, Patrick Cassidy wrote:
>
> Context for the group... & reminding Pat, since he's probably
> forgotten he said this...
>
> I am holding Pat to his statement at a SICoP meeting in approx 2005
> where he said (approximately) that unless "this magic" (e.g.
> ontologies, etc.) was somehow delivered & made accessible to folks in
> the trenches who have zero knowledge, interest or education in
> ontologies, ontologies would be nothing more than an interesting
> academic exercise.
>
>
>
> CONTEXT... I am interested in the potential use of ontology for the
> development/maintenance of software applications.
>
> I am increasingly coming to the conclusion that ontologies are simply
> NOT relevant to this task.
>
> Please tell me I'm using the wrong lance to tilt at the wrong
> windmill.  It won't hurt my feelings.
>
>
>
>   
>> Figuring out precisely what a term in an ontology is supposed to mean has
>> three aspects: what the person developing the ontology intends it to mean;
>> what the person reading the documentation interprets it to mean; and what
>> the computer executing a program using the ontology interprets it to mean.
>> Ideally, they will be the same, but they may differ.
>>     
>
> I would argue that since these are highly likely to be three
> different people, with all the differing experiences, perspectives &
> languages that humans tote around as life baggage, "they WILL differ"
> not may.
>
> Granted my interest in systems development & maintenance may be too
> narrow, I would also argue there are far more people wrestling with
> systems development/maintenance language challenges than people
> building ontologies.
>
>
>
>   
>> so good documentation is critical for ontologies intended to be used by
>> more than a small tightly connected group.
>>     
>
> My money is on the source code being the ONLY accurate documentation
> (assuming, of course, you can find the correct version).  In
> commercial applications, what paper documentation exists may have
> been accurate at one point, but if the system has been in use the
> code is the only accurate record.  [I'd like to think weapons systems
> & nuclear power plants hold to a higher standard, but I have no
> experience here.]
>
> This is in fact one of the great language challenges... as a system
> transitions from paper specifications & documentation through
> development into production and on to new teams of project managers &
> developers (whose native language is likely NOT English), the intent
> of the original language begins to mutate since there is no formal
> process to ensure subsequent generations of maintainers (project
> managers & coders) continue to use the same language & meanings.
>
> Whereas the compiler will force you to use correct VERBS, there is no
> such constraint on the NOUNS... which is why organizations end up
> with literally hundreds of names/nouns for the same thing.
>
> The CD/CDE (as abbreviation for CODE) example is from just such an
> experience.  The original IMS DBA enforced CD as the single correct
> abbreviation for several years in the initial system building phase.
> She left & a new DBA took over.  A new segment was added & he
> evidently liked to abbreviate CODE as CDE.  There was no automated
> mechanism like a compiler to ensure or "encourage" him to use CD
> rather than CDE.  The problem comes when one searches for "-CD "
> (note the space suffix, since CD was used as a suffix in data
> element names), you will NEVER find "-CDE ".  The devil is in the
> details.
>
> In a system that adheres to "good names" one learns that the name of
> something & what it is are in fact the same.  In the physical world
> there are multiple forces (the dairy, the food inspectors, the grocery
> store) to ensure a jug labeled "milk" actually contains milk.  We
> haven't quite learned this lesson yet in systems.
>
>
>
>   
>> For me, good documentation means to state what one intends the ontology
>> element to mean,
>>     
>
> The way you present this I interpret as saying the ontology needs to
> be done BEFORE the system.
>
> This is, of course, not acceptable since the vast majority of systems
> are up & running & have been built/maintained without any
> consideration at all to an ontology(s).
>
> I don't consider reverse engineering ontologies from existing systems
> to be practical.  Primary argument... since the system owner does not
> consider it cost effective to maintain accurate, current
> documentation, they're certainly not going to spend money/time on
> reverse engineering an ontology.  I also factor in that the "reality"
> I look at is a small organization of 10,000 people, with 900
> systems.  Last year ComputerWorld said IBM, with 400,000 people had
> 4,500 "applications" (same as/different from systems? ...who knows).
>
> I am at pains to point out that each one of these
> "applications" (whatever an application is) was built by different
> people at different times for different objectives.  Then maintained
> by different people... all these actors bringing different language
> to the task.
>
>
>
>   
>> To some extent, learning to use a logic-based ontology is similar to
>> learning to use a new object-oriented programming language, but
>> programming languages usually come with a library of immediately useful
>> applications as learning examples.  We haven't reached that point yet in
>> the technology of ontology creation and dissemination.
>>     
>
>
> Long, long ago I was beginning to work on my last programming
> assignment.  I angered the architect (not a word in use then) by
> telling him I did not want to LEARN CICS (at that point a HOT
> language), rather I wanted to USE it.  Took about 10 years, but he
> finally came around to understanding what I was saying.  His
> templates (what we'd call frameworks today) were absolutely
> brilliant.  From a standing start (e.g. knowing absolutely nothing
> about CICS) I was able to take his templates & get 17 CICS programs
> working in 2 weeks.
>
> Twenty years later I was looking at a cross platform development
> tool... and was astonished to find a template/framework tool for
> $350.  The earlier templates probably cost the client $500,000+.
>
> This is the standard I hold an ontology tool to... it better not be
> any more complex than a spell checker/dictionary.  Clearly there's a
> ways to go.
>
>
>
>   
>> For the time being, I look first at the logical axioms associated with a
>> term in an ontology, then at the documentation (usually contained in the
>> "comments" section of an ontology element)
>>     
>
> You keep using words that are difficult to grok...
> "documentation"?    "comments"?  They are outside my experience.   :-)
>
>
> Here's what I consider to be documentation...
>
> a = b * c
>
> Totally accurate & not very useful.  More precisely... USELESS!
> Unfortunately there's a lot of this.
>
> The same logical statement:
>
> wkly-pay = hrs-wkd * rate-pay
>
> Is now potentially comprehensible.
>
> If I can determine this code is in a payroll module then I'm going
> to assume that "pay" is likely a dollars & cents amount.  If I can
> deduce this from just the name without needing to ask someone or fish
> around in some questionable documentation, then I'm a happy camper.
>
> But what I would really like is the ability to hot-key/right click on
> these variables & see what they mean.  I think this look-up facility
> is possible in modern editors like Eclipse... but someone has to dig
> up what the words mean in the context of their specific use... which
> may or may not say anything about their meaning somewhere else in the
> system.
>
> ___________________
> David Eddy
> deddy@xxxxxxxxxxxxx
>
> 781-455-0949
>
>
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx


