
Re: [ontolog-forum] Ontologist Aptitude Test?

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: David Eddy <deddy@xxxxxxxxxxxxx>
Date: Fri, 15 Jan 2010 21:02:16 -0500
Message-id: <04060F08-1FC5-4A87-A5E5-BC612D0BDEA6@xxxxxxxxxxxxx>
Joel -    (01)

On Jan 4, 2010, at 5:22 PM, Joel Bender wrote:    (02)

>> For this (mythical?) ontology beast:
>> - it exists in some form so accessible/useful that people who  
>> know  (and probably care) nothing about ontologies will find it &  
>> use it with minimal training... some motivated users going so far  
>> as to helping to correct & extend it
>> - the ontology(s) become self sustaining
> A laudable goal for an open ontology repository.  Drawing on an  
> analogy with Wikipedia, what questions would you ask someone who  
> wants the authority to edit a page?  What slices of the OBoK would  
> you expect editors/moderators to have?    (03)

I most vociferously object to your promiscuous use of "repository."    (04)

[RANT=ON]    (05)

I will assume that you do not mean: repository = database, which is  
clearly how many people use "repository" these days.  They've been  
using "database" for a long time & "repository" sounds more  
sophisticated.  Bletch!!!!    (06)

First the definitional, contextual details.    (07)

"data dictionary = metadata repository" in my professional  
experience.  I acknowledge that many people refer to a list of data  
elements as a "data dictionary."  That is NOT how I use the term  
"data dictionary."    (08)

When I formally present on this topic I lead with a slide that says:   
"Definition: Repository = burial place"    (09)

The Oxford American Dictionary offers:  "a place, building, or  
receptacle where things are or may be stored : a deep repository for  
nuclear waste."   Notice it offers nothing in terms of getting  
something back OUT of said repository.    (010)

The greater context here is (software) information systems  
development & maintenance in large (Fortune 500 scale) organizations  
in the US, Canada & UK.  Such organizations all have huge, sprawling  
software portfolios that are a constantly evolving (mutating?) mix of  
packages, custom-written code & systems acquired by buying other companies.  
These are the only organizations that will have the recognized need &  
resources to chase the ontology rabbit.    (011)

We've all worked for and experienced such organizations.  A  
distinguishing characteristic is they're S-L-O-W to change.  A major  
reason they're so slow is that no one understands how information  
flows through & across the systems with a net, entirely predictable  
result that it takes a loooooong time to make even small changes.    
As an example: there was an article in ComputerWorld towards the end  
of 2009 that said IBM has 4,000 (4,500?) "applications."  No  
definition was offered for "application."  Personally I assume (with  
absolutely no direct supporting evidence) that an application is  
comprised of a collection of (probably) many systems.    (012)

Data dictionary products have been with us since the dawn of database  
engines in the early 1970s.  Their intent was to provide accurate  
documentation on how the gazillions of interconnected pieces in a  
software portfolio are related to each other.  If organizations had  
accurate, complete data dictionaries in place in the 1990s, Y2K would  
have been a total non-event.    (013)

The harsh market reality is that data dictionaries over their 35+ year  
life span in the US market have had (at best) a 5% success rate.    (014)

As I'm sure John Sowa can attest, after the fiasco of FS (Future  
System), IBM had a last run at the "enterprise metadata  
repository" (again, it's the same thing as a data dictionary) with  
the disastrous AD/Cycle effort in the late 1980s.  At the center of  
the AD/Cycle effort was RepositoryManager (affectionately called  
RepoMan), comprised of some 1,700 DB2 tables.  Truly gave new meaning  
to "putting out the lights."    (015)

A very short overview of what it takes to have a successful data  
dictionary effort is here...   http://www.tdan.com/view-articles/6123  
Notice that this organization has a controlled vocabulary of  
about 1,500 terms.  If I had a death wish I could ask the Marine who  
put this together if he was thinking about an ontology when he did  
this, but I think I'll refrain.    (016)

There are multiple external issues that must be addressed (e.g.  
AUTOMATED) to make a data dictionary self sustaining.  I have had  
discussions with organizations that thought they had a successful  
effort yet were totally puzzled as to why the dictionary needed to be  
integrated into the software configuration management process.  The  
key person leaves & the whole thing collapses after 20 years of success.    (017)

My point here is to object to your far too casual use of the word  
"repository."  There are currently no such capable tools on the  
market.  And even if such "repository" tools existed, organizations  
wouldn't know how to weave them into their process.    (018)

A major hurdle being (crude example): when someone out in the  
organizational boonies discovers a "new" term, the process has to  
push back very rapidly... "I'm sorry Dave, we don't think Postal Code  
is a new thing... please examine this collection of Zip Code  
artifacts..."  If it takes a week to respond, the whole effort is a  
waste.    (019)
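
To make the crude example concrete, here is a minimal sketch of that kind of immediate push-back.  Everything in it is an assumption for illustration: the VOCABULARY contents, the check_new_term name and the 0.8 fuzzy-match cutoff are hypothetical, not taken from any real dictionary product.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary: preferred term -> known synonyms.
VOCABULARY = {
    "zip code": {"postal code", "postcode", "zip"},
    "social security number": {"ssn", "social security no"},
}


def normalize(term):
    """Lower-case the term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def check_new_term(term):
    """Return the existing preferred term that a proposal duplicates, or None."""
    key = normalize(term)
    for preferred, synonyms in VOCABULARY.items():
        if key == preferred or key in synonyms:
            return preferred
    # Fall back to fuzzy matching on preferred terms to catch near-misses
    # such as plurals or small misspellings.
    close = get_close_matches(key, list(VOCABULARY), n=1, cutoff=0.8)
    return close[0] if close else None
```

A call such as check_new_term("Postal Code") answers on the spot with the existing "zip code" entry, which is the point: the lookup has to be part of the process itself, not a ticket that someone answers next week.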

None of this background begins to touch on the semantic/ontology  
issue.  No data dictionary products attempt(ed) to address the  
semantic challenge.  The tool always assumed the human knew what they  
were doing.   When the semantic/language/vocabulary/ontology issue is  
ignored, people will not be able to find what they need and they will  
quickly & for all time write off said dictionary/repository as a  
waste of time.    (020)

Things are actually worse now that we have Google.  Type in a word  
(maybe two if you dare push the bleeding edge?) & you get 10,000,000  
hits to choose from.  This approach does not work inside an  
organization.  First you have to KNOW what you're looking for.  If  
you don't already know that SSN is a likely acronym for "social  
security number" you're simply out of luck.  The fact that SSN also  
means many other things than "social security number" is just another  
wrinkle.    (021)
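
A rough sketch of what handling that wrinkle might look like, assuming a hypothetical in-house GLOSSARY keyed by acronym.  The second SSN meaning and the context labels are purely illustrative.

```python
# Hypothetical glossary: acronym -> list of (expansion, context) pairs.
GLOSSARY = {
    "SSN": [
        ("social security number", "payroll"),
        ("nuclear-powered attack submarine", "navy"),  # illustrative only
    ],
}


def expand(acronym, context=None):
    """Return every known expansion of an acronym, optionally limited to one context."""
    entries = GLOSSARY.get(acronym.upper(), [])
    if context is not None:
        entries = [entry for entry in entries if entry[1] == context]
    return [expansion for expansion, _ in entries]
```

Someone who types "ssn" gets all the candidate meanings back instead of needing to know the right one in advance, and an acronym nobody has registered returns an empty list rather than 10,000,000 hits.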

So... what I really wanted to say is, please do not just blithely  
assume that once we've found/built/created/hallucinated our ontology,  
we can just throw it in one of these mythical "repository" things.    (022)

[RANT=OFF]    (023)

re: going the Wikipedia route for the OBoK...    (024)

I would start with the 90-9-1 assumption... in an online community of  
practice, 90% of people silently lurk (read only), 9% contribute  
occasionally, and 1% are the lifeblood.    (025)

Rather than worry about people munging up the place with random/bogus  
edits, I'd tightly control the write ability to only known good  
players.  And make it easy for someone who's interested in being an  
active contributor to petition for editing rights.    (026)

Not to forget... 99.999% of the people in an organization will have  
zero interest in the finer points of an ontology (matter of fact  
they'll likely be pissed at you for using a big word that you never  
adequately define in order to intimidate them)... what they want is  
an easily accessible resource where they can find useful  
information.  A major ease-of-use challenge will be to package said  
ontology in a non-threatening/intimidating way for the audience who  
has zero interest in the finer points of ontologies.    (027)

YOU may be fascinated with the nuances of language... your potential  
audience/customers have zero interest.    (028)

Think what would happen if you told someone who wanted to learn to  
drive a car.... "We'll start with learning how the transmission  
works... then we'll move on to how valve faces are machined..."   NO  
customers.    (029)

Just my two cents... aren't you glad you asked?    (030)

David Eddy
deddy@xxxxxxxxxxxxx    (031)

781-455-0949    (032)

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx    (033)
