Re: [ontolog-forum] Ontology-based database integration

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Fri, 23 Oct 2009 09:03:48 -0400
Message-id: <4AE1A9B4.6080901@xxxxxxxxxxx>
Dear Matthew, Ian, and Erick,    (01)

It's good to see notes that get into the details of successful projects
and how they work.  But I'd like to start with a point that Erick made,
which reflects a common misconception:  that there is something new
about Semantic Web technology.    (02)

EA> I was wondering how hard or how easy was to "convince" your
 > people (ranging from supervisors/managers to end users) and/or
 > budget keepers or funding agencies ... to embark on a relatively
 > "new" (not that popularised)technology (referring to ontologies,
 > Semantic Web, standards like ISO 15926, etc...)...
 >
 > Users are more than convinced that having a controlled vocabulary
 > and standards are a must.  Nevertheless, such integration could
 > also be achieved by using "classic" technologies (relational
 > databases) and still be able to exchange complex information.    (03)

Please note the slides that Matthew presented:    (04)

MW> The first project Ian mentions was a data model of Shell's
 > Downstream Business. I did a presentation of this at the DAMA
 > conference in London a few years ago.
 >
http://www.matthew-west.org.uk/documents/DAMA%20Developing%20Shell's%20Downstream%20Data%20Model.ppt    (05)

The Express language, which is the major knowledge representation
language for that project, has the expressive power of first-order
logic.  That is a no-no according to Semantic Web dogma.  Also note
that many of the diagrams in those slides are based on P. P. Chen's
Entity-Relationship paper of 1976.  Another slide shows the 6-column
Zachman framework.  The original paper by John Zachman from 1987
only had 3 columns.  The 6-column version was first published in
a paper by John Zachman and me in 1992:    (06)

    http://www.jfsowa.com/pubs/sowazach.pdf    (07)

The idea of using the 6 question words of English at the head of the
columns was one that I suggested.  But I don't claim credit for it.
Aristotle was the one who first used the Greek question words as
the names of his categories, such as the-what, the-where, the-when,
the-what-kind, the-how-much, etc.    (08)

When Cicero was translating Aristotle into Latin, he had to face the
problem that Latin didn't have the equivalent of 'the'.  So he took
the Latin question words like 'quale' for 'what kind' and 'quantum'
for 'how much' and added Latin endings to get 'qualitas' and
'quantitas'.    (09)

IB> I'm not sure I've seen an application yet that really demonstrates
 > the power of an ontology over and above a traditional database system.
 > I've seen plenty of ontology demos and small scale stuff that's really
 > clever (and could never be done with a traditional approach), but
 > hardly anything on a production scale.    (010)

Unfortunately, that is true.  Part of the reason why traditional
database design isn't bad is that the designers have been using UML
diagrams (and other tools that came out of the 1970s and '80s).
They include E-R diagrams for entities and relationships and
type hierarchies that express the most widely used part of
Description Logics.  And they are very well integrated with the
mainstream development methodologies for databases and software.    (011)

For practical DB and software development, good diagrams, such
as those used in UML, Express, and other methodologies, provide
much of the benefit of an ontology.  If you combine them with
terminologies defined in carefully written English, you get
80% of the benefit of an ontology, but written in a notation
that the developers can actually read and understand.    (012)

As I've said many times, the SemWeb would have been adopted and
integrated into the mainstream of production technology very
quickly if its designers had taken one obvious step:  provide a smooth
transition from mainstream technology to the SemWeb.  They
should have integrated Description Logics with UML and RDBs.
Nearly every commercial web site incorporates a relational DB,
and it was a major, major blunder for the SemWeb to ignore them.    (013)

The idea of the conceptual schema (which was literally "the
formalization of a conceptualization") was proposed in the
mid 1970s.  The need for types in databases was also widely
recognized in the mid 1970s.  The DL technology was invented
in the mid 1970s for AI.  In the 1980s, many people in the AI
and DB fields (including Ted Codd, Pat Hayes, and me) went to
joint AI + DB conferences where these issues were discussed
and analyzed.  The kinds of issues discussed then were almost
indistinguishable from the topics discussed in this forum.    (014)

The following points were very well understood in the 1970s:    (015)

  1. The logical independence of the semantics of the data from
     the way it happens to be stored (as tables or networks).    (016)

  2. The fact that the labels at the top of columns in a
     relational table are *not* types.    (017)

  3. The need to integrate a true type hierarchy with DBs
     of any kind (independently of the way they are stored).    (018)

  4. The need to use a formal logic to define the semantics
     of the data from the point of view of the user.    (019)
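
As a minimal sketch of points 2 and 3, the following Python fragment
keeps a small type hierarchy separate from a relational table and
answers a query by supertype.  Everything in it is hypothetical and
chosen only for illustration:  the table name "item", its columns, the
type names, and the sample rows do not come from any project mentioned
in this thread.  It shows one simple way to layer a hierarchy over
stored rows, not a recommended design.

```python
import sqlite3

# Hypothetical type hierarchy (child -> parent), kept apart from the
# storage layout.  The names are illustrative assumptions.
SUBTYPE_OF = {
    "Pump": "Equipment",
    "Valve": "Equipment",
    "Equipment": "PhysicalObject",
}


def is_subtype(type_name, ancestor):
    """Return True if type_name is ancestor or lies below it."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = SUBTYPE_OF.get(current)
    return False


def build_db():
    """Create an in-memory table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    # "type_name" and "site" are only column labels.  Nothing in the
    # table itself says how Pump relates to Equipment.
    conn.execute(
        "CREATE TABLE item (tag TEXT PRIMARY KEY, type_name TEXT, site TEXT)"
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?)",
        [
            ("P-101", "Pump", "Rotterdam"),
            ("V-7", "Valve", "Rotterdam"),
            ("DOC-3", "Document", "London"),
        ],
    )
    return conn


def instances_of(conn, ancestor):
    """List the tags of rows whose type is ancestor or a subtype of it."""
    rows = conn.execute(
        "SELECT tag, type_name FROM item ORDER BY tag"
    ).fetchall()
    return [tag for tag, type_name in rows if is_subtype(type_name, ancestor)]


if __name__ == "__main__":
    print(instances_of(build_db(), "Equipment"))
```

The table could be replaced by a network store without touching
SUBTYPE_OF or is_subtype, which is the independence described in
point 1.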

The major stumbling block to getting these points into SQL
had always been the vendors (primarily the largest one) who
did not want to change their software (primarily because any
change would give the smaller vendors a chance to catch up).    (020)

When I complain about the Semantic Web, I am thoroughly
disgusted by the failure to correct a major blunder that
had been fully understood and analyzed for over 30 years.
The "three amigos" who designed the UML diagrams recognized
the need to bring together the type hierarchy, the E-R
diagrams, relational databases, and programming methodologies.    (021)

But instead of building on that foundation, the SemWebbers
threw the entire field of semantics 30 years *backwards*
into the dark ages of fighting about tables vs. networks.    (022)

Following is my recommendation for an upward-compatible
growth path that integrates *all* technology:    (023)

    http://www.jfsowa.com/talks/cnl4ss.pdf    (024)

That talk has 128 slides.  But I'd like to emphasize the
following six slides:    (025)

  1. Skip to slides 36 to 38 for the conceptual schema.    (026)

  2. Then skip to slides 89 to 91 for the Semantic Web
     and a migration path to the future.    (027)

John Sowa    (028)

