Len Yabloko wrote: (01)
> I am sure you are aware of a deeper issue that makes SQL and Graph DB
> different, that is complete axiomatization of SQL as well as referential
> Reasoning is the ultimate application for both, with query being only one
> form of it. (02)
In the last 20 years, it seems to have become a requirement for somebody
to "completely axiomatize" every computational modeling language, so as
not to be left behind in the claims of "formal grounding". Java is also
formally grounded. So what? (03)
What graph DBs and relational DBs have in common is their ability to
assist in retrieving information for decision purposes. As Len
suggests, the great advantage of RDBs (and SQL compliance) is that they
regularize the process of information creation and information update,
thus guaranteeing consistency and reliability of most retrievals. The
great advantage of graph DBs is flexibility -- the information in them
can grow rapidly and somewhat unpredictably -- but the consequence is
that it is very difficult to prune outdated information and to prevent
the addition of incomplete and inconsistent information that may confuse
critical applications. (I think this is John's point, expressed from a
different viewpoint.) (04)
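To make that contrast concrete, here is a minimal sketch. It uses Python's standard sqlite3 module for the relational side and a plain dict of edge lists as a stand-in for a graph store. The supplier/part tables, the `graph` dict, and the `add_edge` helper are illustrative assumptions, not any product's API.

```python
import sqlite3

# Relational side: the schema regularizes what may be stored.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE part ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " supplier_id INTEGER NOT NULL REFERENCES supplier(id))"
)
conn.execute("INSERT INTO supplier (id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO part (id, name, supplier_id) VALUES (10, 'bolt', 1)")

rejected = False
try:
    # Supplier 99 does not exist, so the insert is refused.
    conn.execute("INSERT INTO part (id, name, supplier_id) VALUES (11, 'nut', 99)")
except sqlite3.IntegrityError:
    rejected = True

# Graph side (hypothetical stand-in): any edge can be added at any time.
graph = {}

def add_edge(subject, predicate, obj):
    """Append an edge without checking that its target is described anywhere."""
    graph.setdefault(subject, []).append((predicate, obj))

add_edge("supplier:1", "name", "Acme")
add_edge("part:10", "supplied_by", "supplier:1")
add_edge("part:11", "supplied_by", "supplier:99")  # accepted without complaint

# Suppliers that are referenced but never described must be found after the fact.
dangling = sorted(
    obj
    for edges in graph.values()
    for _, obj in edges
    if obj.startswith("supplier:") and obj not in graph
)
```

The relational store refuses the inconsistent row when it is written. The graph stand-in accepts it, and the dangling reference turns up only if somebody goes looking for it.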
Both of these technologies are useful for big databases. What should
decide the choice of technology is what you want to do with the data.
There is also a lot of practical business reasoning that now uses
hybrid database support. The original operational data is maintained in
the RDBs, and supports business operations effectively and reliably.
Parts of that data are converted regularly or ad hoc to graph DB forms
to support other associative queries and queries that require inferences. (05)
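A minimal sketch of that hybrid arrangement, assuming a hypothetical `contains` table and using only Python's standard library. The table, the `export_graph` and `all_components` helpers, and the sample rows are invented for illustration. A production system would export to a real graph store rather than an in-memory dict.

```python
import sqlite3
from collections import defaultdict, deque

# Operational data stays in a relational table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contains (assembly TEXT NOT NULL, component TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO contains VALUES (?, ?)",
    [("car", "engine"), ("car", "wheel"), ("engine", "piston"), ("piston", "ring")],
)

def export_graph(connection):
    """Copy the rows into an adjacency list, as a periodic or ad hoc export might."""
    adjacency = defaultdict(list)
    for assembly, component in connection.execute(
        "SELECT assembly, component FROM contains"
    ):
        adjacency[assembly].append(component)
    return adjacency

def all_components(adjacency, start):
    """Breadth-first traversal: everything contained in `start`, directly or indirectly."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

graph = export_graph(conn)
parts_of_car = all_components(graph, "car")
```

The relational table remains the system of record. The exported adjacency list serves only the transitive "what is inside what" question, which is the kind of associative query that is awkward to phrase against the original rows.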
It takes a certain amount of enlightenment to see the value in the new
screwdriver alongside your tried-and-true hammer, and to realize at the
same time that the guy who tells you his new screwdriver will completely
replace hammers is a nitwit. The IT industry as a whole seems to have a
really hard time seeing technologies as incremental and complementary
rather than revolutionary. (06)
Edward J. Barkmeyer Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263 Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263 Cel: +1 240-672-5800 (08)
"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority." (09)
> I'd just like to comment on the following point:
> >> I'm sure that things like Neo4J can be useful for many applications,
> >> but if you really have large graphs and large numbers of graphs,
> >> you need to index the data. And there are some very good methods
> >> for indexing and finding exact and approximate matches.
> > DN: Yes. This is why they are used for big data sets like FaceBook and
> > Google. Each user starts their social graph with an index of themselves.
> > The traversal then finds other nodes related to that user. The advantage
> > here is also the flexibility offered by graph databases IMO. RDBMS is
> > much more inflexible as the schema has to be changed to add a single
> > property for an object. Graph DB's allow one property to be added then,
> That is the kind of application for which graph traversal is good:
> 1. You have a pointer to a clearly defined starting point, such as
> a web page for a specific individual. For info about a specific
> individual, FaceBook is a *structured* DB for which they have
> predefined specific categories that they use for well-worn
> branches in the graph. Most paths are short, and people seldom
> ask complex queries.
> 2. RDBMS is a different kind of structured data, and I certainly do not
> intend to support all the encrustations and limitations that have
> evolved over the years in RDBMS. Nor do I endorse all the quirks
> and peculiarities of SQL, which I used to call "the worst notation
> for logic that has ever been inflicted on innocent users" -- but
> that was before I saw RDF and OWL.
> 3. As for physical layout, graphs and tables are two logically
> equivalent choices -- anything you can store in one can be mapped
> to the other. That is an implementation choice. In terms of
> matrices, a densely populated matrix is best stored in table form,
> and a sparse matrix is best stored in a graph form.
> 4. Ideally, the users (both programmers and end users) should not need
> to know or care about the implementation. For all its flaws, SQL is
> far better than graph traversal for complex queries. Most object
> oriented DBs offer programmers a choice of SQL vs native path-based
> methods. And most of them choose SQL.
> 5. I also agree that flexibility is extremely important. A major
> complaint about RDBMS is the need for a DB administrator to define
> a schema in advance. But note that casual users love *spreadsheets*
> for dense data. Their table headings are a rudimentary, easy-to-
> change schema, and users love the simplicity of a rectangular grid.
> 6. None of the issues listed above are new. They were very thoroughly
> discussed and analyzed during the "DB wars" of the 1970s, and there
> have been 40 more years of R & D on all those issues. My major
> complaint about the SW is that they ignored all that R & D and
> forced a one-size-fits-all format on everybody.
> 7. The SW notion of interoperability is to provide a mapping from
> RDB to RDF. But that is the *worst conceivable* approach. It is
> unbelievably inefficient for dense data, and it is vastly worse
> than SQL for complex queries. If anybody had suggested that method
> at a VLDB conference in the 1980s or '90s they would have been
> laughed out of the room in disgrace.
> > DN: I think a lot of the issues you note are also solved by better
> > modelling and indexing. Nevertheless, it will be very interesting to
> > watch the 2012 growth of these Graph DB's.
> Graph DBs and RDBMS are both designed for professional programmers
> who are forced to dig into the implementation details. The 40+ years
> of R & D on databases focused on implementation-independent methods.
> You don't even need any research studies to see why application
> programmers prefer JSON to RDF -- it's equally good for representing
> graphs, tables, or trees.
> I agree that for optimum performance on very large applications,
> such as FaceBook or Amazon, professional systems programmers need
> to get down into the bowels of the implementation. Those systems
> provide good interfaces for casual users who go to their sites.
> But application programmers should *never* need to get into the
> details of the implementation. And interoperability across
> independently developed systems should *always* be at a level
> that is independent of the implementation. That is the point
> of the following paper and slides:
> Future directions in semantic systems
> Integrating Semantic Systems
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J