
Re: [ontolog-forum] Is there something I missed?

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Adrian Walker <adriandwalker@xxxxxxxxx>
Date: Sun, 1 Feb 2009 16:38:35 -0500
Message-id: <1e89d6a40902011338w2e6b4e86vcdebbb68143c0bdb@xxxxxxxxxxxxxx>
Hi John --

Interesting overview of Relational Databases vs RDF.  You wrote:

The problem with SPARQL is that RDF is not designed to support
indexing.  The people who built so-called "triple stores" say
that they're efficient because they run in RAM.  But any DB that
fits in RAM is a toy.  If you take a non-toy DB and map it to
RDF, it doesn't fit in RAM.  Furthermore, those triples aren't
indexed, and they're not designed to be paged in an orderly
fashion.  The result is endless disk thrashing.

Let me play devil's advocate for RDF for a moment.

One of the many ways of writing RDF is to take the N3 format and expand it so that every triple is explicit, with no shorthand using commas or colons [1].  Call this RDFe.

Now put your RDFe in a relational DBMS table, and index it in all possibly useful ways.  Maybe also do some design that splits the table on classes of the second items in the triples.
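To make the suggestion concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration under stated assumptions: the table name `triples`, the column names `s`, `p`, `o`, the three index orderings, and the helper names are all invented for this example, and SQLite stands in for whatever DBMS would actually be used.

```python
import sqlite3


def build_triple_store(triples):
    """Load explicit (subject, predicate, object) triples into one table.

    Table, column, and index names are hypothetical, chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL)"
    )
    # Three column orderings, so that a lookup with any one or two
    # positions bound can use some index prefix.
    conn.execute("CREATE INDEX idx_spo ON triples (s, p, o)")
    conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
    conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def subjects_with(conn, predicate, obj):
    """Return the subjects that have the given predicate and object."""
    rows = conn.execute(
        "SELECT s FROM triples WHERE p = ? AND o = ? ORDER BY s",
        (predicate, obj),
    )
    return [row[0] for row in rows]


def query_plan(conn, sql, params=()):
    """Return SQLite's plan text, to check whether an index is used."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " ".join(row[3] for row in plan)
```

Calling `query_plan` on the lookup in `subjects_with` shows whether the DBMS searches an index rather than scanning the table. The further split of the table by predicate mentioned above is not shown here.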

Yes, applications will cause the DBMS to spend a lot of time re-assembling n-ary relations from the explicit triples.  However, it's likely that using high end hardware plus scads of memory can yield, say, performance similar to that of the original relational database on a laptop.  And in some cases the n-ary relations can be precomputed.
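As a rough sketch of what that re-assembly looks like, the Python below explodes a small 4-ary relation into triples and then rebuilds it with self-joins, one join per extra column. Everything named here (the `employee` data, the predicates `name`, `dept`, `salary`, and the table `employee_wide`) is an assumption made up for the example, and SQLite via the standard sqlite3 module is only a stand-in DBMS.

```python
import sqlite3

# Hypothetical 4-ary relation: (id, name, dept, salary).
EMPLOYEE_ROWS = [(1, "Ann", "Sales", 50000), (2, "Bob", "R&D", 60000)]

# One self-join per additional column of the original relation.
REASSEMBLE_SQL = """
SELECT n.s AS id, n.o AS name, d.o AS dept,
       CAST(sal.o AS INTEGER) AS salary
FROM triples AS n
JOIN triples AS d   ON d.s = n.s   AND d.p = 'dept'
JOIN triples AS sal ON sal.s = n.s AND sal.p = 'salary'
WHERE n.p = 'name'
ORDER BY n.s
"""


def explode(rows):
    """Turn each n-ary row into one triple per non-key column."""
    triples = []
    for emp_id, name, dept, salary in rows:
        subject = f"emp:{emp_id}"
        triples.append((subject, "name", name))
        triples.append((subject, "dept", dept))
        triples.append((subject, "salary", str(salary)))
    return triples


def load(triples):
    """Store the triples in an indexed in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.execute("CREATE INDEX idx_spo ON triples (s, p, o)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


def reassemble(conn):
    """Rebuild the n-ary rows at query time (the expensive path)."""
    return conn.execute(REASSEMBLE_SQL).fetchall()


def precompute(conn):
    """Materialize the joined result once, so later reads skip the joins."""
    conn.execute("DROP TABLE IF EXISTS employee_wide")
    conn.execute("CREATE TABLE employee_wide AS " + REASSEMBLE_SQL)
```

The precomputed table has to be rebuilt or maintained whenever the underlying triples change, which is part of the effort being weighed here.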

Whether RDF(e) gives you functionality that's worth the above effort is of course another question.

                               Cheers,  -- Adrian

[1]  See e.g. www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent

Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English over SQL and RDF
Online at www.reengineeringllc.com    Shared use is free

Adrian Walker
Reengineering



On Sun, Feb 1, 2009 at 2:55 PM, John F. Sowa <sowa@xxxxxxxxxxx> wrote:
Dear Alex,

 > It seems relationship "DL versus Prolog" is more complicated than
 > I thought ( http://arxiv.org/ftp/arxiv/papers/0711/0711.3419.pdf )

Mitre is one of the companies that I was thinking of when I said that
many large groups translate RDF and OWL to Prolog.  Following is an
excerpt from the abstract of that paper:

   ... we are developing the SWORIER system, which enables efficient
   automated reasoning on ontologies and rules, by translating all
   of them into Prolog and adding a set of general rules that properly
   capture the semantics of OWL. We have also enabled the user to make
   dynamic changes on the fly, at run time. This work addresses several
   of the concerns expressed in previous work, such as negation,
   complementary classes, disjunctive heads, and cardinality, and it
   discusses alternative approaches for dealing with inconsistencies
   in the knowledge base.

The Mitre approach has many interesting features, and I recommend that
paper as an important survey.  At VivoMind, we also use Prolog, but
we have a different approach that involves multiple agents that use
a variety of reasoning methods, both logical and analogical.

 > Maybe there is a direct comparison of reasoning "power" in different
 > systems?  Something like a reasoners' competition ;)

Following is a paper I wrote about some of the issues:

   http://www.jfsowa.com/pubs/fflogic.pdf
   Fads and fallacies about logic

If you are interested in competitions among theorem provers, you might
look at the Thousands of Problems for Theorem Provers (TPTP):

   http://www.tptp.org/

 > In our project we have just two goals:
 >    - transform RDB to KB. (Because knowledge is better than data;)
 >    - rebuild our application system on top of KB reasoning engine.
 > (and here OWL-DL right now is mandatory;)

Whenever anybody throws around words like "knowledge" and "data",
that is a sign of a smoke screen designed to confuse the pointy
haired bosses who know nothing about either one.  Nearly *every*
large knowledge base (such as Cyc, which is the largest one
ever implemented) is designed to use a relational database to
store the low-level facts.

Whenever OWL-DL is declared "mandatory", that is a sign that some
pointy-haired boss was sufficiently confused to drink the Kool-Aid.
There are horror stories about large RDB systems in the US that
were "mandated" to be translated to RDF with disastrous results.

The basic issue is that RDBs use indexes to find the relevant
data in logarithmic time.  When the index cannot be used because
it is necessary to process an entire column, the RDB pages the
data in an orderly fashion.
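The difference between the two access paths can be sketched in a few lines of Python. This is only an illustration: a sorted in-memory list stands in for an index (a real RDB would typically use a B-tree on disk), and the function names are invented for the example.

```python
def indexed_lookup(sorted_keys, target):
    """Binary search over sorted keys; returns (found, probes).

    The number of probes grows with log2 of the number of keys.
    """
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, probes


def full_scan(keys, target):
    """Sequential pass over every key; returns (found, probes).

    This is the orderly, front-to-back read used when no index applies.
    """
    probes = 0
    for key in keys:
        probes += 1
        if key == target:
            return True, probes
    return False, probes
```

With a million keys, the indexed lookup needs at most about 20 probes, while the scan may have to touch every key. The scan still reads the data in order, which is what keeps paging predictable.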

The problem with SPARQL is that RDF is not designed to support
indexing.  The people who built so-called "triple stores" say
that they're efficient because they run in RAM.  But any DB that
fits in RAM is a toy.  If you take a non-toy DB and map it to
RDF, it doesn't fit in RAM.  Furthermore, those triples aren't
indexed, and they're not designed to be paged in an orderly
fashion.  The result is endless disk thrashing.

If I had a pointy-haired boss who mandated the translation
of an RDB to RDF, I would immediately send my resume to every
reasonable employer -- because you can be certain that this
system is going to be a disaster.  And the PHB is going to save
himself by blaming the failure on the people who work for him.


