Hello Ed,
I agree wholeheartedly. RDF and SPARQL make data integration easier (without
solving the fundamental issues of course). But they are a bad option for data
storage because maintaining consistency is so difficult (think about deleting
a row or transactions). There should be a big warning sign above the SPARQL
UPDATE standard for those who think relational databases are legacy. But I
think nobody in the Semantic Web video or on this list actually said that?
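
To make the "deleting a row" point concrete, here is a minimal sketch in plain Python (no SPARQL, no real triple store). The employee data, the identifiers such as emp:7, and both helper functions are made up for illustration only:

```python
# Hypothetical data: one relational row, and the same facts spread over
# several (subject, property, value) triples.
row_table = {"emp:7": {"name": "Alice", "dept": "R&D", "salary": 5000}}

triples = {
    ("emp:7", "name", "Alice"),
    ("emp:7", "dept", "R&D"),
    ("emp:7", "salary", 5000),
    ("emp:9", "manager", "emp:7"),  # another subject still points at emp:7
}

def delete_row(table, key):
    """Relational-style delete: one operation removes the whole row."""
    table.pop(key, None)

def delete_subject(store, subject):
    """Triple-style delete: drop every triple that has this subject.

    Triples that mention the subject in the value position survive, so
    they are returned separately for the caller to deal with.
    """
    remaining = {t for t in store if t[0] != subject}
    dangling = {t for t in remaining if t[2] == subject}
    return remaining, dangling

delete_row(row_table, "emp:7")
remaining, dangling = delete_subject(triples, "emp:7")
```

In this toy case the "row" is gone from the triple set, but the manager triple still refers to emp:7. A relational schema would typically guard against that with a foreign key constraint; with triples the application has to remember to do it.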
Regards,
Michael Brunnbauer
On Thu, Sep 12, 2013 at 04:27:49PM +0000, Barkmeyer, Edward J wrote:
> First of all, I agree strongly with John Sowa (for a change). "Extended
>RDBMS" are what the world runs on, and they have survived two fad replacement
>technologies. RDF will simply be the third. As John says, RDF can be a data
>representation at the interface to a data repository, but it provides no
>fundamentally different value. SPARQL is just another query language, and a
>good "extended RDBMS" can run SPARQL queries.
>
> The children who think they are reinventing data access with RDF triple
>stores need to understand that they are comparing a 5th normal form relational
>database with a 3rd normal form relational database, and 5th normal form is a
>more restricted specialization of 3rd normal form. (Restating the formal
>mathematical definitions) In 3NF, the table usually names a class, some set of
>columns identifies a subject that is an instance of the class, and other sets
>of one or more columns state individual facts about that subject. In 5th
>normal form, the table usually names a property, one column designates the
>subject, and the other column, if any, designates a value of the property for
>that subject. RDF is 5th normal form. The advantage of 5th normal form is
>that it facilitates joins in multidatabases, which is precisely the intent of
>its use in the erstwhile "Semantic Web" and "Linked Open Data". The
>disadvantage of 5th normal form is that you have to do a lot of joins to
>answer simple queries, and joins are expensive in large databases. As
>Stonebraker and others in the mid-1980s pointed out, it is useful to convert
>selected 3rd normal form tables to 5th normal form for query-specific
>multidatabase joins ("distributed queries"), in order to deal with the problem
>of "partitioning" in multidatabases (an individual database can have some of
>the facts about a given set of things, or all of the facts about a subset of
>the things, and multiple databases can overlap in both ways). In
>latest-and-greatest terms that is to say, it is useful to convert selected
>database rows to RDF triples in order to answer certain queries.
>
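
The conversion described in the paragraph above (selected wide rows turned into one-fact-per-row triples) can be sketched roughly as follows. This is only an illustration in plain Python; the product table, its columns, and the two helper functions are assumptions, not anything from a real RDB-to-RDF tool:

```python
def row_to_triples(table, key_column, row):
    """Split one wide row into one (subject, property, value) fact per column."""
    subject = f"{table}/{row[key_column]}"
    return [(subject, column, value)
            for column, value in row.items()
            if column != key_column]

def rebuild_row(triples, subject):
    """Collect all facts about one subject back into a single wide row.

    A single scan stands in here for what a store would typically do
    with one join per property.
    """
    return {prop: value for s, prop, value in triples if s == subject}

# Hypothetical wide row: the key column identifies the subject,
# the other columns each state one fact about it.
product_row = {"id": 42, "name": "Widget", "price": 9.5}

product_triples = row_to_triples("product", "id", product_row)
rebuilt = rebuild_row(product_triples, "product/42")
```

The triple form is easy to merge with triples from another source that uses the same subject identifiers, which is the multidatabase advantage. Getting the original row back needs the extra reassembly step, which is the join cost.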
> On the other hand, RDF is particularly clumsy for dealing with data that is
>best represented in 4th normal form, such as a statement of a quantified
>property. (In 4NF, a row states one fact, but the identifiers for the subject
>and the object can be multiple columns.) The 'quantity' object is represented
>as two 'columns': number and unit. In RDF/5NF the quantity becomes a database
>key (ooh IRI, but ad hoc) with two more 'assertions' that relate it to a
>number and a unit. (Since engineering is what my division does, this is
>important to us.)
>
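
The quantity case above can be sketched the same way. This is a minimal illustration under stated assumptions: the beam example, the property names "number" and "unit", and the ad hoc node identifier qty:1 are all hypothetical:

```python
def quantity_row_to_triples(subject, prop, number, unit, node):
    """Turn one row with a two-column quantity into three triples.

    The quantity needs its own identifier (node) so that the number and
    the unit can each be attached to it as a separate assertion.
    """
    return [
        (subject, prop, node),
        (node, "number", number),
        (node, "unit", unit),
    ]

# One row stating one fact: the length of beam:1 is 2.5 m.
wide_row = ("beam:1", "length", 2.5, "m")

triples = quantity_row_to_triples(*wide_row, node="qty:1")
```

One fact in a single row becomes three triples plus an invented key, and a query for the length has to follow the intermediate node to reach the number and the unit.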
> As John says, the right way forward is to see RDF as a standard
>representation for 5th normal form relational rows, and SPARQL as a query
>language that augments the capabilities of SQL (not as a replacement for it).
>The real problem that neither solves is to get agreement on vocabulary and on
>the interpretation of individual data.
>
> Now, as to persons in industry who think their relational database systems
>are "legacies", some of them are right. Others are why the IT industry is
>permitted to waste billions of dollars/euros/yen on fad technologies and
>repainting of old ideas.
>
> What makes a system a "legacy" is not the technology used, unless that
>technology is no longer supported, but rather its relevance to the way you
>currently do business. There were a lot more database designers in the 1980s
>and 1990s than there were competent modelers who could build properly
>extensible conceptual schemas. So, a lot of the purpose-built databases were
>brittle designs at the outset and have become as much a part of the problem as
>the business practice moved on. It is poor fault analysis to say this is a
>consequence of the technology, without determining that the fault was not in
>the design. ("A poor workman blames his tools.") (If your database and your
>processing software assume that all of your products will be sold in barrels,
>by volume, and you subsequently get into the business of synthetic fibre,
>which is sold in spools, by length, is your legacy problem the fact that you
>used an RDBMS?)
>
> Doing a new poor design with new technology just creates more expense and a
>new legacy system with a shorter lifecycle. This is particularly the case
>when the would-be designers have little prior experience in what makes a good
>and flexible model, in the mistaken belief that their new technology will make
>up for their personal incompetence. (I have seen a lot of weak or downright
>bad OWL models, and the problem is not in the technology.) Database design,
>whatever the target technology, is a SKILL. You have to learn the skill, and
>it involves understanding the technology, becoming familiar with the business
>problem space and the intended usages, and learning the art of
>"abstraction", the "art of design". In most failed database projects, the
>devil is not in the details, but rather in the overall conceptualization, or
>the lack of one.
>
> The grave danger here is that we must teach the emerging workforce to use the
>solid technologies that are in use in industry in good designs, so that those
>technologies will continue to be supported. We can allow the technologies
>that have fallen out of use, in favor of clearly better ones, to die. But we
>should not allow pursuit of fads to destroy the future support for viable
>technologies that are in use. The software industry has 60 years of
>experience. If the new workforce only knows about the last 10, we have a
>serious education problem. This industry desperately needs to examine new
>technologies in the light of older technologies and ask what is really
>different and how that is better; otherwise it spends a lot of time and money
>relearning the same lessons.
>
> In so many words, RDF and RDBMS are closely related technologies, and neither
>SQL nor RDF is the solution to any problem. They are tools, and a good
>workman will figure out which to use and how, when dealing with a specific
>problem.
>
> -Ed
>
> (My blog of the week...)
>
>
> --
> Edward J. Barkmeyer Email: edbark@xxxxxxxx
> National Institute of Standards & Technology
> Systems Integration Division
> 100 Bureau Drive, Stop 8263 Work: +1 301-975-3528
> Gaithersburg, MD 20899-8263 Mobile: +1 240-672-5800
>
> "The opinions expressed above do not reflect consensus of NIST,
> and have not been reviewed by any Government authority."
--
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail brunni@xxxxxxxxxxxx
++ http://www.netestate.de/
++
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J