
Re: [ontolog-forum] Semantic Enterprise Architecture

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Sat, 04 Sep 2010 12:47:54 -0400
Message-id: <4C82783A.3060102@xxxxxxxxxxxxxx>
>> But nobody has a clue about what methods of processing LOD will prove
>> to be the most useful and successful over the next 5 to 10 years.
> Thing is linked open data (LOD) isn't the same thing as linked data.
> Let's assume you mean publicly available open data published using the
> principles in TimBL's famous meme, in this case, handling this data at
> Web Scale is the major challenge at hand. By this I mean the ability to
> do the following:
> 1. Faceted Browsing (using HTML pages for instance) over masses of data
> (DBpedia is small re. scale I have in mind, but that challenges most)
> 2. Precision Find using SPARQL where patterns include "?p" (any
> predicate) thereby generating extremely wide columns in RDBMS engines
> (typically performing self joins).

Important clarification: I meant to say wide rows with many columns (not 
wide columns) re. "?p" in SPARQL patterns. This is a very important 
distinction in the context above.

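To make the wide-row point concrete, here is a minimal sketch, assuming a 
single three-column triple table in SQLite. The table name "triples" and the 
helper build_self_join are hypothetical and only for illustration; this is 
not how Virtuoso or any other engine actually stores or plans its queries. 
Each additional triple pattern with an unbound predicate adds one more self 
join, and two more columns to every result row.

```python
import sqlite3

def build_self_join(n_patterns):
    """Build SQL that joins the (hypothetical) triples table to itself
    n_patterns times on the subject, with every predicate left unbound."""
    cols = ", ".join(f"t{i}.p AS p{i}, t{i}.o AS o{i}" for i in range(n_patterns))
    joins = "triples t0" + "".join(
        f" JOIN triples t{i} ON t{i}.s = t0.s" for i in range(1, n_patterns)
    )
    return f"SELECT t0.s AS s, {cols} FROM {joins}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:name", "Bob"),
])

# Roughly equivalent to: SELECT * WHERE { ?s ?p0 ?o0 . ?s ?p1 ?o1 }
cursor = conn.execute(build_self_join(2))
width = len(cursor.description)   # 1 subject column + 2 columns per pattern
rows = cursor.fetchall()
print(width, len(rows))           # prints: 5 5
```

With n patterns the row is 1 + 2n columns wide, and the row count grows with 
the product of the triples per subject, which is why this shape hurts at scale.
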
I could have also added the following to the list of challenges above:

1. Ordering query (SQL or SPARQL) results by Entity Rank (combining 
link coefficients and full text scores)
2. Transitivity and graph traversal oriented path expressions
3. Backward-chained inference applied to queries incorporating the items 
above.
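
As an illustration of item 2, transitivity over a triple table can be 
sketched with a recursive SQL query. This is a generic technique (a recursive 
common table expression in SQLite), not a description of how Virtuoso does 
it; the "triples" table, the reachable function, and the sample data are 
assumptions made up for the example.

```python
import sqlite3

def reachable(conn, start, predicate):
    """Return every node reachable from `start` by following `predicate`
    one or more times. UNION (not UNION ALL) drops duplicates, so the
    recursion terminates even when the graph contains cycles."""
    sql = """
    WITH RECURSIVE reach(node) AS (
        SELECT o FROM triples WHERE s = ? AND p = ?
        UNION
        SELECT t.o FROM triples t JOIN reach r ON t.s = r.node WHERE t.p = ?
    )
    SELECT node FROM reach ORDER BY node
    """
    return [row[0] for row in conn.execute(sql, (start, predicate, predicate))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:A", "rdfs:subClassOf", "ex:B"),
    ("ex:B", "rdfs:subClassOf", "ex:C"),
    ("ex:C", "rdfs:subClassOf", "ex:D"),
    ("ex:D", "rdfs:subClassOf", "ex:B"),   # deliberate cycle
])

print(reachable(conn, "ex:A", "rdfs:subClassOf"))
# prints: ['ex:B', 'ex:C', 'ex:D']
```

Every step of the traversal is another self join on the same table, so path 
expressions run into the same cost pattern as the "?p" case above.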

>>> MySQL doesn't cut it at all. Neither does Oracle or any other
>>> traditional RDBMS. You need a hybrid DBMS e.g OpenLink Virtuoso.
>> First of all, Oracle *is* a very efficient hybrid system.  They
>> accept both SQL and SPARQL as query languages, and they store
>> the data in tables or networks, as appropriate.
> Yes, and they can't deal with the two fundamental problems I outline
> above. This is something we addressed from the get go re. Virtuoso, and
> isn't even part of what you will be seeing in the imminent paper I
> mentioned in my earlier post.
> To deal with #1 and #2 we had to do the following:
> 1. Implement a Breakup mechanism for wide columns

That is, breaking up the wide rows with many columns (from self joins). 
This is critical for any SPARQL engine implemented using a hybrid model.

Anyway, the white paper delves into these matters and includes TPC-H 
benchmark results.

[SNIP]

-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

