ontolog-forum

Re: [ontolog-forum] What goes into a Lexicon?

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Hans Polzer" <hpolzer@xxxxxxxxxxx>
Date: Wed, 29 Feb 2012 22:04:50 -0500
Message-id: <008901ccf758$11624f60$3426ee20$@verizon.net>

I agree with most of your comments, even the ones about getting a Marine to control the data dictionary. I quibble a bit with the comment on Postal Code being the same as Zip Code. The reason given by the person you were talking to for them being different is clearly specious. However, in many cases Postal Code is used to handle the wide variety of postal code types implemented by different national postal systems, rather than limiting the acceptable values to the US Postal Service-specific Zip Code, whether that be the 5-digit version or the 5+4 version. This is an acceptable solution if the system/application in question doesn’t really care what the Postal Code value is: it just captures whatever the user enters and spits it back out on demand. However, if the system has to do routing, sorting, or some other processing that depends on decoding the specific meaning of the postal code value (demographic market analysis, for example), then the data dictionary should specify the different types of postal codes (including the US Zip Code) that can be entered in that data element, and use a country code field or similar structure to control which postal code frame of reference is used to decode the value. And that assumes we won’t have postal codes for other planets, and it doesn’t deal with the issue of what happens if any of the postal code types change.

 

So being explicit about the frames of reference and context assumptions in the data dictionary is very important. Unfortunately, most systems are built with these assumptions being implicit and not represented in either the data dictionary itself or in the system external interface specification (directly or indirectly).
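
As a rough illustration of making the frame of reference explicit, here is a minimal Python sketch in which a country code selects the rule used to validate a postal code. The table POSTAL_CODE_FORMATS, the PostalCode type, and parse_postal_code are hypothetical names invented for this example, and the patterns are simplified assumptions, not the official format definitions of any postal service.

```python
import re
from dataclasses import dataclass

# Hypothetical data dictionary fragment: one simplified pattern per postal
# code frame of reference, keyed by a two-letter country code.
POSTAL_CODE_FORMATS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),            # Zip Code: 5 digit or 5+4
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),   # letters and digits
    "DE": re.compile(r"\d{5}"),                     # 5 digits, not a Zip Code
}


@dataclass(frozen=True)
class PostalCode:
    country: str  # names the frame of reference used to decode `value`
    value: str


def parse_postal_code(country: str, value: str) -> PostalCode:
    """Validate `value` against the format registered for `country`."""
    pattern = POSTAL_CODE_FORMATS.get(country)
    if pattern is None:
        raise KeyError(f"no postal code format registered for {country!r}")
    normalized = value.strip().upper()
    if not pattern.fullmatch(normalized):
        raise ValueError(f"{value!r} is not a valid {country} postal code")
    return PostalCode(country, normalized)
```

Note that a value such as "12345" passes for both "US" and "DE" in this sketch; only the country field says which decoding applies, which is the implicit assumption most systems leave out.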

 

Hans

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of David Eddy
Sent: Wednesday, February 29, 2012 6:42 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] What goes into a Lexicon?

 

Rich -

 

On Feb 29, 2012, at 1:39 PM, Rich Cooper wrote:



But when I said “I haven’t heard of an ontology project of large size”, I meant one which used a true ontology instead of just a data dictionary, which was not a solution to the problem then either.  The DD is useful for cleaning up a project after all the problems have been solved, and the program had to be shipped to the maintenance team.  But it only added to the cost of development and didn’t solve development problems; it was considered good maintenance practice only. 

 

 

Please note: although I describe these scenarios in the terminology of software development & programmers, if you're not a programmer or haven't experienced the programming process, please just substitute your own professional experience.

 

 

 

You've either accidentally or very deliberately put your finger on the bull's-eye.

 

The discarded-to-the-waste-bin-of-history data dictionary* (aka metadata repository**) HAS to be woven into the development process.  As a documentation afterthought, it's a waste of effort.

 

One of the functions of a data dictionary properly done is that people of whatever skill level (managers, analysts, designers, programmers, whoever) don't just make up terminology on the fly.  Someone MUST control the language list.  Personally I prefer a Marine, since (a) you don't have to explain this issue to them more than once, and (b) they tend to carry some degree of authority.

 

The real trick (which clearly most organizations have ignored) is to have the terminology validation/documentation process woven into the development process.  When/if someone invents a new term that's not on the approved list, their activity (running a compile?) fails.  The organizational reality is that a "new" term is actually either something we already have or the product of sheer ignorance.  [I have had the conversation where someone argued that dear ol' Postal Code & Zip Code were very different things, since one is letters & numbers & one is just numbers.  My perspective: if you're big enough to have a mailroom, they're functionally the same thing regardless of what they're called or what they look like.]

 

Absent this sort of automated terminology control process, we get the totally out of control terminology redundancy we have today.
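
To make the idea concrete, here is a minimal Python sketch of such a check, assuming identifiers are written in snake_case or camelCase. The APPROVED_TERMS set and the unapproved_terms function are hypothetical names for this example; a real process would read the approved list from the data dictionary and run as a build step that fails when the returned list is not empty.

```python
import re

# Hypothetical approved vocabulary; in practice the controlled list would
# come from the data dictionary, not be hard-coded.
APPROVED_TERMS = {"customer", "postal", "code", "country", "order", "id"}

# Splits snake_case and camelCase identifiers into their component words.
WORD = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])")


def unapproved_terms(identifiers, approved=APPROVED_TERMS):
    """Return, sorted, every word used in `identifiers` that is not approved."""
    unknown = set()
    for identifier in identifiers:
        for word in WORD.findall(identifier):
            if word.lower() not in approved:
                unknown.add(word.lower())
    return sorted(unknown)


if __name__ == "__main__":
    bad = unapproved_terms(["customer_postal_code", "customerZipCode"])
    if bad:
        # A non-zero exit is what makes the surrounding build step fail.
        raise SystemExit(f"terms not in the controlled vocabulary: {bad}")
```

In this sketch "customerZipCode" is rejected because "zip" is not on the list, which forces the conversation about whether it is really a new term or just another name for postal code.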

 

Whether or not ontologies would help, I have no idea. 

 

 

Again... I would argue that very few organizations were able to effectively use the data dictionary process.  ONE of the key success factors in a successful data dictionary implementation is a controlled vocabulary.  If Fortune 1000 firms (the ones with the biggest, most complex, most convoluted software portfolios) have not mastered these skills, how will they cope with ontologies?

 

FIRST we learn to ride a tricycle... eventually we learn to drive a Ferrari.

 

 

I would argue that in the context of language control, or precisely defined language/terminology, most organizations are still learning to lift their heads in the crib.

 

 

 

 

If I were describing this process for someone writing a report in the Queen's English, I'd have a special spell checking process that required the author (better yet, the human editor) to explicitly state which meaning they mean for specific terms/acronyms.  And I would deliver PDF documents with "hovering help" to show the specific meaning for ambiguous terms.

 

In a properly written paper, it is correct to write "International Business Machines (IBM)" the first time used & then use "IBM" the rest of the time.  I see this violated commonly in the New York Times & Wall Street Journal.
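
The first-use rule can be checked mechanically. Here is a minimal Python sketch, assuming an acronym is any run of two or more capital letters and that a proper introduction puts it in parentheses right after the spelled-out name; undefined_acronyms is a hypothetical helper, not part of any existing spell checker.

```python
import re


def undefined_acronyms(text):
    """Return acronyms whose first appearance is not of the form '(ABC)'."""
    seen = set()
    flagged = []
    for match in re.finditer(r"\b[A-Z]{2,}\b", text):
        acronym = match.group()
        if acronym in seen:
            continue  # only the first occurrence matters
        seen.add(acronym)
        before = text[max(match.start() - 1, 0):match.start()]
        after = text[match.end():match.end() + 1]
        if not (before == "(" and after == ")"):
            flagged.append(acronym)
    return flagged
```

For example, given "International Business Machines (IBM) sells software. IBM and the NYT disagree.", this returns ["NYT"], since IBM was introduced in parentheses and NYT was not.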

 

 

As you dive deeper into documents intended only for organizational consumption, the level of jargon goes up & the explanations/definitions go down.  I believe "bafflegab" is the correct description.

 

 

 

 

* data dictionary - what people have experienced for a data dictionary is all over the map.   This is what I refer to:   http://www.tdan.com/view-articles/6123  (although I have substituted "metadata repository" for data dictionary in this 2007 version)

 

 

** In approximately 1989 IBM briefly introduced their long-heralded AD/Cycle with RepositoryManager and changed the "data dictionary" term to "metadata dictionary."  I regard them as synonyms.  While John Sowa clearly knows of AD/Cycle, most IBMers today have never heard of it.  So much for learning the lessons of history.

 

___________________

David Eddy

 

781-455-0949

 

