Re: [ontolog-forum] Data, Silos, Interoperability, and Agility

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Tue, 24 Sep 2013 17:25:04 -0400
Message-id: <52420330.7000807@xxxxxxxxxxxxxx>
On 9/24/13 12:14 PM, John F Sowa wrote:
> On 9/24/2013 8:51 AM, Kingsley Idehen wrote:
>> The WHERE CLAUSE of SQL isn't a nirvana. It's a slot.
> More precisely, it has the expressive power of FOL.  It supports
> the Boolean operators AND, OR, and NOT.  With the existential
> quantifier and the option of nested WHERE clauses, that supports
> first-order logic (but without functional expressions).
> That expressive power is the equivalent of Datalog for queries
> and for views.  A Datalog rule is the equivalent of a virtual
> relation in SQL.    (01)
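
As a rough illustration of the quoted point, here is a minimal sketch using Python's standard sqlite3 module. The supplier/part tables and the view name are made up for this example. It shows a universally quantified query ("suppliers that supply every part") written with nested WHERE clauses, NOT, and EXISTS, and wrapped in a view that plays the role of a Datalog rule.

```python
import sqlite3

# Hypothetical toy schema, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sid TEXT PRIMARY KEY);
    CREATE TABLE part     (pid TEXT PRIMARY KEY);
    CREATE TABLE supplies (sid TEXT, pid TEXT);

    INSERT INTO supplier VALUES ('s1'), ('s2');
    INSERT INTO part     VALUES ('p1'), ('p2');
    INSERT INTO supplies VALUES ('s1', 'p1'), ('s1', 'p2'), ('s2', 'p1');

    -- The view acts as a virtual relation, roughly the Datalog rules:
    --   missing(S)      :- supplier(S), part(P), not supplies(S, P).
    --   supplies_all(S) :- supplier(S), not missing(S).
    -- "For all parts P" is encoded as "there is no part P such that
    -- no supplies row links S to P", using two nested WHERE clauses.
    CREATE VIEW supplies_all AS
    SELECT s.sid
    FROM supplier s
    WHERE NOT EXISTS (
        SELECT 1 FROM part p
        WHERE NOT EXISTS (
            SELECT 1 FROM supplies x
            WHERE x.sid = s.sid AND x.pid = p.pid
        )
    );
""")

# Query the virtual relation exactly like a stored table.
all_parts_suppliers = [
    row[0] for row in conn.execute("SELECT sid FROM supplies_all ORDER BY sid")
]
print(all_parts_suppliers)
```

Only s1 supplies both parts, so only s1 satisfies the universally quantified condition.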

This is all true, but there is a practical issue with regard to 
contemporary data access, integration, and management challenges.    (02)

The expressive power of FOL needs to be meshed with the data being 
processed. The data being processed is now disparately located, 
heterogeneously shaped, voluminous, and volatile.    (03)

RDF based Linked Data enables the fusion of logic, data, and data 
access. Using HTTP URIs as identifiers makes a big difference to many 
challenges in this regard e.g., pointing to data across data spaces.    (04)

SPARQL enables one to query RDF based Linked Data and produce query 
solutions from it. It also enables data access and integration that SQL 
simply cannot handle in any practical manner [1][2][3][4].    (05)
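
To make the role of HTTP URIs as identifiers a little more concrete, here is a minimal sketch in plain Python. It does not use SPARQL or any RDF library, and nothing is fetched over the network. The two "data spaces", the example.org and example.com URIs, and the helper names are all assumptions made up for illustration. The point is only that when both sides name the same entity with the same URI, integration reduces to a set union followed by pattern matching.

```python
# Predicate identifiers. FOAF_NAME is the real FOAF term; PLACED_BY is hypothetical.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
PLACED_BY = "http://example.com/schema#placedBy"

ALICE = "http://example.org/people/alice#this"
ORDER_17 = "http://example.com/orders/17#this"

# Two independently maintained data spaces, each a set of
# (subject, predicate, object) triples.
people_space = {(ALICE, FOAF_NAME, "Alice")}
orders_space = {(ORDER_17, PLACED_BY, ALICE)}


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None is a wildcard, like a query variable."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def names_for_orders(graph):
    """Join order -> person -> name across whatever triples the graph holds."""
    rows = []
    for order, _, person in match(graph, p=PLACED_BY):
        for _, _, name in match(graph, s=person, p=FOAF_NAME):
            rows.append((order, name))
    return sorted(rows)


# Because identifiers are globally scoped, merging is just a set union:
# no key remapping or schema alignment step is needed for this join.
merged = people_space | orders_space
order_names = names_for_orders(merged)
print(order_names)
```

Neither data space can answer the question alone; the join only succeeds on the merged graph, because the order's placedBy value is the very URI that the other space describes.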

> When applied to any RDB (or the equivalent in RDF), those queries
> can be evaluated in polynomial time.    (06)

Not disputing that. The issue is how one references and de-references 
data as part of the pipeline that produces query solutions. SQL just 
doesn't cut it. We surmount this issue by having a SQL and SPARQL hybrid 
DBMS. In short, bar the OLTP side of things, we have SPARQL and SQL 
running SQL oriented benchmarks where the differences in results are 
utterly insignificant [3].    (07)

>> My issue is all about what you can do within the slot when you
>> have an entity relationship model oriented graph endowed with
>> machine-comprehensible semantics.
> To say that SQL or Datalog expressions are just slots is highly
> misleading.    (08)

I am not trying to imply that, since that would indeed be misleading.    (09)

I am claiming that the SQL WHERE CLAUSE provides a slot into which a lot 
can be placed, as per my examples where I have a SPARQL query in the 
WHERE CLAUSE of a SQL statement.    (010)
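
For readers without a hybrid engine to hand, the "slot" idea can be approximated with a minimal sketch in Python's standard sqlite3 module. This is not SPASQL and not the syntax of any particular product: the graph_ask function, the account table, and the example.org URIs are hypothetical stand-ins. The sketch registers a graph lookup as a function that SQL can call, so that a predicate over graph data sits inside the WHERE clause.

```python
import sqlite3

KNOWS = "http://xmlns.com/foaf/0.1/knows"
ALICE = "http://example.org/people/alice#this"
BOB = "http://example.org/people/bob#this"
CAROL = "http://example.org/people/carol#this"

# A tiny in-memory graph of (subject, predicate, object) triples.
GRAPH = {(ALICE, KNOWS, BOB)}


def graph_ask(s, p, o):
    """Return 1 if the triple is in GRAPH, else 0 (a crude stand-in for a graph query)."""
    return int((s, p, o) in GRAPH)


conn = sqlite3.connect(":memory:")
# Expose the graph lookup to SQL as a three-argument function.
conn.create_function("graph_ask", 3, graph_ask)

conn.execute("CREATE TABLE account (login TEXT, person_uri TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("alice", ALICE), ("carol", CAROL)],
)

# The WHERE clause is the slot: rows are kept only if the graph says
# the account's person knows Bob.
rows = conn.execute(
    "SELECT login FROM account "
    "WHERE graph_ask(person_uri, ?, ?) = 1 ORDER BY login",
    (KNOWS, BOB),
).fetchall()
logins = [r[0] for r in rows]
print(logins)
```

The relational table holds only a URI per row; everything about who knows whom lives in the graph and is consulted from inside the WHERE clause.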

>   But I agree that E-R models are very useful.  In fact,
> type hierarchies and E-R diagrams are the two most widely used UML
> diagrams.  Together, they can represent most published OWL ontologies.
> The other UML diagrams go far beyond OWL.  And programmers know them.
>> SQL RDBMS products don't support Reference types as native data
>> types  -- its a silo by definition with an inability to semantically
>> intermingle  data across engines from different vendors
> The limitations of those products were recognized for over 30 years --
> by all the major DB researchers, developers, and users.    (011)

The issues were recognized, for sure. What actually happened is a whole 
different matter though!    (012)

>   That was the
> theme of the ANSI/SPARC conceptual schema in 1978.  But that work ended
> in a technical report instead of a standard -- because certain vendors
> felt that standards would threaten their market dominance.    (013)

Exactly!    (014)

>> I specifically provided a SPASQL (SPARQL inside SQL) example to showcase
>> what can be achieved in the SQL WHERE CLAUSE when you leverage SPARQL.
>> If there is some concrete alternative to what I provided, I am sure
>> someone could post and example just as I did.
> Prolog and other logic-based languages have been accessing both
> relational and graph-based databases for over 30 years.    (015)

Yes, they have, of course they have. But here's what they don't do (even 
at the time of typing this response): provide a mechanism for working 
with contemporary data which is disparately located, heterogeneously 
shaped, voluminous, and volatile. They don't leverage URIs as the 
Reference type mechanism, and all de-reference functions are product 
specific.    (016)

The products you mention simply don't deliver on the critical 
interoperability requirement that's ground-zero in today's so-called 
"big data" world.    (017)

A fast and intelligent silo is still a silo. The world doesn't need more 
silos, it needs URIs to make data flow across data spaces with access 
controls driven by entity relationship semantics.
> Experian, for example, uses Prolog on huge data volumes to evaluate
> everybody's credit worthiness.
>   They process resources of every kind.
> Unfortunately, Experian, by the nature of their business, are too
> secretive to tell anybody exactly what they do and how they do it.    (018)

Just another silo :-)    (019)

They could actually provide controlled access to relevant data but they 
won't. Naturally, that's partly to do with the nature of their business 
and partly to do with the tools they are using, i.e., they need to be 
able to cost-effectively produce Linked Data as an integral part of 
their API mechanism for interoperability with others (even if this is a 
select club of partners).
>> Sharing this presentation I stumbled upon based on its relevance to
>> a variety of discussion threads on this list.
>> [1] 
> I strongly support LOD and open standards.  But this proposal claims
> that NL content should be mapped to RDF + OWL.  That teeny-tiny *silo*
> may be useful for DBpedia -- but it's hopelessly inadequate for NLP.    (020)

I look at all of these endeavors as tiny puzzle pieces that are part of 
a bigger jigsaw puzzle. Each makes its own contribution to a variety of 
Webs (public or private). I think the proposal shows how they've mapped 
it to RDF without implying it's the sole option for the broad realm of NLP.    (021)

Links:    (022)

[1] http://bit.ly/UydU9t -- Simple SPARQL based Data Integration
[2] http://bit.ly/Y6TIfs --  SPARQL based Data Integration using 
InverseFunctional relations
[3] http://bit.ly/WmKlJ0 -- SPARQL 1.1 reasoning capabilities
[4] http://bit.ly/Wk19i4 -- SPARQL based data cleansing via reasoning.    (023)

> John
>       (024)

--     (025)

Regards,    (026)

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen    (027)
