
Re: [ontolog-forum] Data, Silos, Interoperability, and Agility

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Fri, 27 Sep 2013 13:39:47 -0400
Message-id: <5245C2E3.2050403@xxxxxxxxxxxxxx>
On 9/27/13 10:00 AM, John F Sowa wrote:
> Data issues are *no* longer nascent and somewhat lost in marketing babble
> spewed by DBMS vendors. My data and applications that help me work with my
> data are two distinct things.
I agree with the second sentence.  But data issues had been analyzed
by linguists and logicians since Aristotle.  They certainly weren't
"nascent" in the 1960s.

To what degree did data-silo-fication matter (in the minds of users and IT decision makers) prior to the ubiquitous Web and Internet explosions?

ODBC (Open Database Connectivity), which was Microsoft's implementation (plus some extensions) of the Call Level Interface from the SQL Access Group (SAG, which is now part of X/Open), offered the first broadly adopted, industry-wide standard for data de-silo-fication targeting SQL RDBMS engines. This happened in the early 1990s, at a time when RDBMS silos started challenging the productivity of information and knowledge workers.
And I consider the SW hype as misleading
and misguided as any marketing babble.

Yes, it has contributed to this problem too. I've commented many times about "poor narratives" associated with RDF and "The Semantic Web".

> The need to loosely couple Data and Applications (including DBMS
> engines) *wasn't* so clear in the past.
Don't forget the punched-card machines.  By the 1950s, there had been
half a century of experience with "loosely coupled" decks of cards
that were processed by a wide variety of different applications.

I go back further than that. Mankind has captured observations using a variety of media that are loosely coupled with computing devices. My fundamental point is that Data (circa 2013) isn't instinctively decoupled from tools (applications) such as DBMS products. Most end-users assume they can't do anything with data without starting a DBMS client application, for instance.

Modulo "poor narratives" RDF (plus RDFS and OWL) does provide a very useful mechanism for demonstrating and exploiting the following:

1. separating data (creation and publication) from data management, business process, and visualization/presentation oriented applications

2. data flow across silos when HTTP URIs are leveraged as the entity denotation mechanism

3. data virtualization across silos that leverages relation semantics (e.g., rdfs:subClassOf, rdfs:subPropertyOf, owl:sameAs, owl:InverseFunctionalProperty, owl:inverseOf, owl:SymmetricProperty, owl:equivalentClass, and owl:equivalentProperty).

1-3 are basically all about the result of fusing web-like structured data representation with webby logic (or Blogic [1] as coined by Pat Hayes).
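
A minimal sketch of points 1-3, assuming two hypothetical silos (a CRM and a billing system). Every URI, class name, and helper function here is invented for illustration, and plain Python tuples stand in for RDF triples; a real deployment would use an RDF store and a reasoner rather than this hand-rolled logic. It shows data published independently of any application, HTTP URIs as entity names, and a merged view derived from owl:sameAs and rdfs:subClassOf.

```python
# Hypothetical illustration: all URIs and names below are invented.
SAME_AS = "owl:sameAs"
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

# Point 1: each silo publishes data independently of any application.
# Point 2: entities are denoted by HTTP URIs.
silo_a = {
    ("http://crm.example/id/alice", TYPE, "http://crm.example/ns#Customer"),
    ("http://crm.example/id/alice", "http://crm.example/ns#email", "alice@example.org"),
}
silo_b = {
    ("http://billing.example/id/42", "http://billing.example/ns#balance", "100"),
}
# Point 3: relation semantics that bridge the silos.
links = {
    ("http://crm.example/id/alice", SAME_AS, "http://billing.example/id/42"),
    ("http://crm.example/ns#Customer", SUBCLASS_OF, "http://shared.example/ns#Person"),
}

def canonical_finder(triples):
    """Return a function mapping each term to one representative of its
    owl:sameAs group (a small union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                parent[max(a, b)] = min(a, b)  # deterministic representative
    return find

def virtualize(*graphs):
    """Merge graphs into one view, collapsing owl:sameAs groups and
    propagating rdf:type along rdfs:subClassOf until nothing changes."""
    merged = set().union(*graphs)
    find = canonical_finder(merged)
    view = {(find(s), p, find(o)) for s, p, o in merged if p != SAME_AS}
    changed = True
    while changed:
        changed = False
        supers = [(s, o) for s, p, o in view if p == SUBCLASS_OF]
        for s, p, o in list(view):
            if p != TYPE:
                continue
            for sub, sup in supers:
                if o == sub and (s, TYPE, sup) not in view:
                    view.add((s, TYPE, sup))
                    changed = True
    return view

view = virtualize(silo_a, silo_b, links)
```

In the resulting view, the e-mail address from one silo and the balance from the other hang off a single entity, which is also typed as a Person, even though neither source system knows about the other.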

Every large business of any kind had huge volumes of cards, which
they rapidly computerized in the '50s and '60s.  The "loose coupling"
of cards inspired them to design DBs with a similar level of loose
coupling.  That was in 1963 for GE's network DB, and 1966 for IBM's
hierarchical IMS.  Each record in IMS was a "virtual card".

In fact, they used to mail those cards to customers, who wrote more
information on them.  Toll collectors would hand punched cards to
the drivers -- with the warning "Do not bend, fold, or mutilate".

The decks of cards were organized in columns, which were printed
as tables or spreadsheets.  In 1969, Ted Codd recognized that
those tables could be represented as relations in logic.  That
led to a very clear decoupling.

Yes, but what happened after that, following the rise of the SQL RDBMS?

Meanwhile, the AI community had been working on logic-based
representations since the 1950s.  For example, see John McCarthy's
"Basis for a mathematical theory of computation" from 1961:


That paper talks about n-tuples for representing functions and
relations.  McCarthy also used his "conditional expressions",
which he introduced into LISP in the late 1950s.  He also worked
with the Algol committee and proposed if-then-else as the
"syntactic sugar" for conditionals.

By the end of the '70s, the combination of the practical and
theoretical experience led to a huge literature about the many
practical and theoretical issues with DBs and KBs.

Yes, and we ended up with a SQL-RDBMS-dominated realm, which simply accelerated the construction of data silos while also totally destroying the end-user's ability to understand what Data actually is or was. Basically, we ended up with the pervasive misconception that structured data is data that can be processed by a SQL RDBMS. Most ironic of all, most RDBMS products don't even have a built-in ability to import data from CSV files, so the utility of a CSV file for data capture and interchange was lost on many (including many Semantic Web folks!).
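
To make the CSV point concrete, here is a minimal sketch that reads CSV text with Python's standard csv module and turns each cell into a subject-predicate-object statement, with no DBMS in the loop. The sample data, the base URI, the vocabulary URI, and the function name csv_to_triples are all assumptions made up for this illustration.

```python
import csv
import io

# Hypothetical sample data and URIs, invented for illustration.
CSV_TEXT = "id,name,city\n1,Alice,Boston\n2,Bob,Lagos\n"
BASE = "http://data.example/people/"
VOCAB = "http://data.example/schema#"

def csv_to_triples(text, key="id", base=BASE, vocab=VOCAB):
    """Turn each non-key cell into a (subject, predicate, object) tuple.

    The key column names the row's entity; every other column header
    becomes a predicate in the (hypothetical) vocabulary."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{base}{row[key]}"
        for column, value in row.items():
            if column != key:
                triples.append((subject, f"{vocab}{column}", value))
    return triples

triples = csv_to_triples(CSV_TEXT)
```

The CSV file is the data; the script is just one of many tools that could consume it.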

In comparison, I would call the SW hype naive, provincial, and
based on wishful thinking that was untested against reality.

Methinks, too harsh, even on its very worst "poor narrative" day :-)

Anyway, bearing in mind the existence of the LOD cloud and broad use of Semantic Web stack tools, what is it that you would like to fix? Personally, I don't see a disconnect between what you seek and what we have. All I see are poor examples and bad narratives, from time to time.

As I stated in a prior post, your display notation for conceptual graphs trumps the one currently bandied about in most Semantic Web presentations associated with RDF. I use a variant of your notation when I teach these concepts:

[<#StatementSubject>] -->-- [<#statementPredicate>] -->-- [<#StatementObject>] | ["literal statement object"] .

Which then allows me to triangulate:

[<#statementPredicate>] -->-- [<#predicateDomain>] -->-- [<#StatementPredicateDomain>]
                    |-- [<#predicateRange>] -->-- [<#StatementPredicateRange>] .

Back to:

[<#StatementSubject>] -->-- [<#type>] -->-- [<#StatementPredicateDomain>] .

[<#StatementObject>] -->-- [<#type>] -->-- [<#StatementPredicateRange>] .
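
The triangulation above can be sketched in a few lines of Python. This assumes that <#predicateDomain> and <#predicateRange> behave like rdfs:domain and rdfs:range, and that <#type> behaves like rdf:type; the function name infer_types is hypothetical, and tuples stand in for statements. It is a simplification that treats every object as an entity, so it makes no special case for literal objects.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

def infer_types(triples):
    """Derive type statements from domain and range declarations.

    For each statement (s, p, o): if p has a declared domain D, infer
    (s, rdf:type, D); if p has a declared range R, infer (o, rdf:type, R)."""
    domains, ranges = {}, {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
        elif p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in triples:
        for d in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, d))
        for r in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, r))
    return inferred

graph = {
    ("#StatementSubject", "#statementPredicate", "#StatementObject"),
    ("#statementPredicate", RDFS_DOMAIN, "#StatementPredicateDomain"),
    ("#statementPredicate", RDFS_RANGE, "#StatementPredicateRange"),
}
inferred = infer_types(graph)
```

Run on the three statements in graph, this yields exactly the two "Back to" type statements: the subject is typed by the predicate's domain and the object by its range.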

And when they get really excited, I pack these into a "Situation" box, etc.


[1] http://slidesha.re/18CtxGK -- Blogic .




Kingsley Idehen	      
Founder & CEO 
OpenLink Software     
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

