ontolog-forum

Re: [ontolog-forum] Ontology-based database integration

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Fri, 09 Oct 2009 22:55:41 -0400
Message-id: <4ACFF7AD.4050209@xxxxxxxxxxx>
Paola, Kingsley, and Cecil,    (01)

Before commenting on your notes, I want to apologize for an
emotional moment when I used the phrase "profoundly foolish"
about people who are intelligent, but had not given fair
support for the wide range of available DB technologies.    (02)

PDM> ... as I am learning the ropes, I see some benefits/advantages
 > of SPARQL (over Sql for example) It would be great if you could
 > produce some data to backup your statement above?    (03)

My major concern was not the adoption of SPARQL as one option
among many, but the choice of that approach and of triple stores
in preference to the enormous range of DB technologies that had
been designed, developed, and implemented over the past forty
years.    (04)

I have no particular love for SQL, and I had been disappointed
that much better notations for relational DBs had been ignored.
My personal preference for a DB query, constraint, and rule
language would be Datalog, which can be viewed as a simplified
version of Prolog that is specialized for DB access.  In fact,
when Ted Codd, the inventor of relational DBs, first saw Prolog,
his immediate reaction was "I wish I had invented that."    (05)

But to support my point that the notation for a query language is
independent of the way the data is stored, I'd like to mention a
discussion I had a few years ago with the developers of Objectivity,
which is an object-oriented database that stores the data as graphs.
For the query language, they supported both SQL and a path-based
query language similar to SPARQL.  However, most of their users
preferred SQL because they were more familiar with it, so the
Objectivity developers kept SQL alongside the other languages
for accessing their database.  Following is their FAQ sheet:    (06)

    http://www.objectivity.com/pages/objectivity/faq.asp    (07)

But one of my major criticisms of the Semantic Web is that the W3C
had not provided better integration with relational DBs, since nearly
every commercial web site, both large and small, was built around a
relational DB.  For a summary, see slides 89 and 90 of the following:    (08)

    http://www.jfsowa.com/talks/cnl4ss.pdf    (09)

KI> My response is about a single point: you can have a multi-model
 > DBMS engine. One capable of being optimized for scenarios specific
 > to a given model. In this case Graph vs Relational.  I think we
 > actually agree on the concept of multi-model DBMS engines, as
 > opposed to one model fits all, right?    (010)

Yes, definitely.  I had mistakenly assumed that you were arguing
in support of SPARQL and triple stores in preference to other
methods for DBMS.  One reason why I like Datalog is that it is
a clean notation (much cleaner than SQL), which can be supported
by any kind of underlying DB organization.    (011)
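
As a rough illustration of that last point, here is a minimal Python
sketch (not Datalog itself, and not any particular product's API) in
which one Datalog-style rule is evaluated over two different storage
layouts.  The class names RowStore and TripleStore, the facts() method,
and the sample data are all hypothetical, chosen only for this example.

```python
# Hypothetical illustration: one Datalog-style rule, two storage layouts.
# Rule being evaluated:  grandparent(X, Z) :- parent(X, Y), parent(Y, Z).


class RowStore:
    """Relational layout: one table (a list of rows) per predicate."""

    def __init__(self, tables):
        self.tables = tables

    def facts(self, predicate):
        return list(self.tables.get(predicate, []))


class TripleStore:
    """Graph layout: a flat set of (subject, predicate, object) triples."""

    def __init__(self, triples):
        self.triples = set(triples)

    def facts(self, predicate):
        return [(s, o) for (s, p, o) in self.triples if p == predicate]


def grandparents(store):
    """Join parent(X, Y) with parent(Y, Z) on the shared variable Y."""
    parent = store.facts("parent")
    return sorted(
        {(x, z) for (x, y1) in parent for (y2, z) in parent if y1 == y2}
    )


rows = RowStore({"parent": [("ann", "bob"), ("bob", "cal"), ("bob", "dee")]})
triples = TripleStore(
    [("ann", "parent", "bob"), ("bob", "parent", "cal"), ("bob", "parent", "dee")]
)

row_answer = grandparents(rows)
triple_answer = grandparents(triples)
print(row_answer == triple_answer)
```

The rule itself never mentions how the facts are stored; only the
facts() method differs between the two backends.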

SPARQL is a step backwards to the bad old days of CODASYL DBTG
and the database wars between Ted Codd and Charlie Bachman.
Some people think that because I use conceptual graphs I would
prefer a path-based access method.  But that is definitely false.    (012)

My favorite graph-based approach is to give the system a query
graph and say "Find all matches to the query graph within a
given semantic distance, and I don't care how you do it."
And by the way, a Datalog expression is a graph, after you
parse it and treat the variables as cross-links.    (013)
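
To make the remark about Datalog and graphs concrete, the following is
a minimal Python sketch under simplifying assumptions: it handles only
a flat rule body of the form p(...), q(...), with no nested terms, and
the function names parse_atoms and query_graph are made up for this
example.  Each atom becomes a node, and each variable shared by two
atoms becomes a cross-link between them.

```python
import re
from collections import defaultdict


def parse_atoms(body):
    """Split 'p(X, Y), q(Y, Z)' into [('p', ['X', 'Y']), ('q', ['Y', 'Z'])]."""
    atoms = []
    for name, args in re.findall(r"(\w+)\(([^)]*)\)", body):
        atoms.append((name, [a.strip() for a in args.split(",")]))
    return atoms


def is_variable(term):
    # Datalog/Prolog convention: variables start with an uppercase
    # letter or an underscore; everything else is a constant.
    return term[:1].isupper() or term.startswith("_")


def query_graph(body):
    """Return (atoms, links); links are (atom_index, atom_index, variable)."""
    atoms = parse_atoms(body)
    occurrences = defaultdict(list)
    for index, (_, args) in enumerate(atoms):
        for term in args:
            if is_variable(term) and index not in occurrences[term]:
                occurrences[term].append(index)
    links = []
    for var, nodes in sorted(occurrences.items()):
        # Link every pair of atoms that mention the same variable.
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                links.append((nodes[i], nodes[j], var))
    return atoms, links


atoms, links = query_graph("parent(X, Y), parent(Y, Z), lives_in(Z, boston)")
print(links)
```

In this toy query the variable Y links the first two atoms and Z links
the last two, while the constant boston stays a label on its own atom.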

KI> We shouldn't write-off anything, its about using the best
 > combination of tools for the problem at hand. In this case, the
 > trick is to combine technology and techniques from a range of
 > realms: raw DBMS and Middleware.    (014)

I'm happy with that statement.    (015)

CL> These messages are not dumped to a database then processed. They
 > are processed in real-time, in memory, off the data stream. If you
 > had time to database them, then you are not in need of real-time
 > analysis. That is not the use case I am talking about here, or
 > in most decision support that we do in healthcare.    (016)

You can do all kinds of just-in-time analysis and optimization
on streaming data.  We (at VivoMind) do that on high-speed data
streams, and we definitely do not want some programmer to try
to outguess our optimizer by specifying a path in a SPARQL query.
The automatic optimization is much better, since it's tailored
to the actual data stream, not to somebody's preconceived
idea of what the data stream should be.    (017)

KI> As for your processing off a stream, what point are you trying
 > to make about how the data is going to be accessed, reconciled,
 > and meshed?  Where does thinking occur?  Where does remembering
 > occur? What's the grey matter? How does the machine construct
 > frames of references when dealing with these huge streams of
 > disparate data?    (018)

What we do at VivoMind is to represent everything in conceptual
graphs and to index the graphs by a Cognitive Signature (TM) as
they arrive.  When new graphs arrive, we compute their Cognitive
Signatures, check whether we ever saw anything similar, and
retrieve the previous cases.    (019)

The time to find all graphs within a given semantic distance of
a query graph varies logarithmically with the number of graphs.
For example, finding one graph out of a billion takes only three
times as long as finding one out of a thousand, since log(10^9)
is three times log(10^3).  See slides 10 to 13 of    (020)

    http://www.jfsowa.com/talks/pursue.pdf    (021)
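
The factor of three follows directly from the logarithmic scaling.  A
small Python check of the arithmetic, assuming lookup time is simply
proportional to the logarithm of the number of graphs (the function
name relative_lookup_cost is made up for this example):

```python
import math


def relative_lookup_cost(n_large, n_small):
    """Ratio of lookup times if time grows as log(number of graphs)."""
    return math.log(n_large) / math.log(n_small)


# One billion graphs compared with one thousand graphs.
ratio = relative_lookup_cost(10**9, 10**3)
print(round(ratio, 6))
```

The base of the logarithm does not matter here, because it cancels
in the ratio.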

John    (022)

