
Re: [ontolog-forum] ONTOLOG community event planning and scheduling sess

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Michael Brunnbauer <brunni@xxxxxxxxxxxx>
Date: Fri, 13 Sep 2013 10:23:31 +0200
Message-id: <20130913082331.GA20126@xxxxxxxxxxxx>

Hello Ed,

I agree wholeheartedly. RDF and SPARQL make data integration easier (without
solving the fundamental issues, of course). But they are a bad option for data
storage because maintaining consistency is so difficult (think about deleting
a row, or about transactions). There should be a big warning sign above the
SPARQL UPDATE standard for those who think relational databases are legacy.
But I think nobody in the Semantic Web video or on this list actually said that?
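
To make the "deleting a row" point concrete, here is a minimal sketch using a
toy in-memory triple store (a plain Python set, not a real RDF library). All
identifiers such as emp:7 and ex:manager are made up for illustration. The
assumption is that a "row" corresponds to all triples sharing one subject.

```python
# A toy triple store: a set of (subject, predicate, object) tuples.
# Every identifier here is hypothetical and only serves the illustration.
triples = {
    ("emp:7", "rdf:type", "ex:Employee"),
    ("emp:7", "ex:name", "Alice"),
    ("emp:7", "ex:worksIn", "dept:3"),
    ("dept:3", "ex:manager", "emp:7"),
}


def delete_row(store, subject):
    """Remove every triple whose subject is `subject` (the 'row')."""
    return {t for t in store if t[0] != subject}


def dangling_references(store, subject):
    """Return the triples that still point at `subject` as their object."""
    return {t for t in store if t[2] == subject}


after = delete_row(triples, "emp:7")
leftover = dangling_references(after, "emp:7")
```

In this sketch nothing stops the store from keeping the ex:manager triple that
points at the deleted employee. A relational schema with a foreign key
constraint would either reject the delete or cascade it.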

Regards,

Michael Brunnbauer

On Thu, Sep 12, 2013 at 04:27:49PM +0000, Barkmeyer, Edward J wrote:
> First of all, I agree strongly with John Sowa (for a change).  "Extended 
>RDBMS" are what the world runs on, and they have survived two fad replacement 
>technologies.  RDF will simply be the third.  As John says, RDF can be a data 
>representation at the interface to a data repository, but it provides no 
>fundamentally different value.  SPARQL is just another query language, and a 
>good "extended RDBMS" can run SPARQL queries.
> The children who think they are reinventing data access with RDF triple 
>stores need to understand that they are comparing a 5th normal form relational 
>database with a 3rd normal form relational database, and  5th normal form is a 
>more restricted specialization of 3rd normal form.  (Restating the formal 
>mathematical definitions) In 3NF, the table usually names a class, some set of 
>columns identifies a subject that is an instance of the class, and other sets 
>of one or more columns state individual facts about that subject.  In 5th 
>normal form, the table usually names a property,  one column designates the 
>subject, and the other column, if any, designates a value of the property for 
>that subject.  RDF is 5th normal form.  The advantage of 5th normal form is 
>that it facilitates joins in multidatabases, which is precisely the intent of 
>its use in the erstwhile "Semantic Web" and "Linked Open Data".  The 
>disadvantage of 5th normal form is that you have to do a lot of joins to 
>answer simple queries, and joins are expensive in large databases.  As 
>Stonebraker and others in the mid-1980s pointed out, it is useful to convert 
>selected 3rd normal form tables to 5th normal form for query-specific 
>multidatabase joins ("distributed queries"), in order to deal with the problem 
>of "partitioning" in multidatabases (an individual database can have some of 
>the facts about a given set of things, or all of the facts about a subset of 
>the things, and multiple databases can overlap in both ways).  In 
>latest-and-greatest terms that is to say, it is useful to convert selected 
>database rows to RDF triples in order to answer certain queries.
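
(Editorial illustration, not part of the quoted message.) The conversion of a
class-style table row into one fact per property, as described above, might be
sketched as follows. The table name, column names and the naming scheme for
subjects and properties are assumptions made up for this example; real
mappings define their own conventions.

```python
def row_to_triples(table, key_column, row):
    """Split one table row into (subject, property, value) triples.

    The subject is built from the table name and the key value. Each
    non-key column becomes one triple. The naming scheme is hypothetical.
    """
    subject = f"{table}/{row[key_column]}"
    triples = [(subject, "rdf:type", table)]
    for column, value in row.items():
        if column != key_column and value is not None:
            triples.append((subject, f"{table}#{column}", value))
    return triples


def reassemble(triples, subject):
    """Rebuild the row: one lookup per property, the analogue of the joins."""
    return {
        prop.split("#", 1)[1]: value
        for subj, prop, value in triples
        if subj == subject and "#" in prop
    }


row = {"id": 42, "name": "Widget", "colour": "red"}
triples = row_to_triples("product", "id", row)
rebuilt = reassemble(triples, "product/42")
```

The row has to be put back together property by property, which is the cost
of the extra joins mentioned above.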
> On the other hand, RDF is particularly clumsy for dealing with data that is 
>best represented in 4th normal form, such as a statement of a quantified 
>property.  (In 4NF, a row states one fact, but the identifiers for the subject 
>and the object can be multiple columns.)  The 'quantity' object is represented 
>as two 'columns': number and unit.  In RDF/5NF the quantity becomes a database 
>key (ooh IRI, but ad hoc) with two more 'assertions' that relate it to a 
>number and a unit.  (Since engineering is what my division does, this is 
>important to us.)
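
(Editorial illustration, not part of the quoted message.) The quantity case
can be sketched as below: a single row holding number and unit turns into
three triples, because the quantity needs its own identifier. The ex:
property names and the generated node labels are assumptions for this
example, not taken from any particular vocabulary.

```python
import itertools

# Counter used to mint ad hoc identifiers for quantity nodes (hypothetical scheme).
_ids = itertools.count(1)


def quantity_triples(subject, prop, number, unit):
    """Express 'subject has prop = number unit' as triples.

    The quantity gets its own node, which is then related to the number
    and the unit by two further triples.
    """
    node = f"_:q{next(_ids)}"
    return [
        (subject, prop, node),
        (node, "ex:numericValue", number),
        (node, "ex:unit", unit),
    ]


# One row states the whole fact: subject, property, number, unit.
quantity_row = ("part:17", "ex:length", 2.5, "m")
triples = quantity_triples(*quantity_row)
```

A query for "length of part:17 in metres" then has to follow the intermediate
node before it reaches the number and the unit.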
> As John says, the right way forward is to see RDF as a standard 
>representation for 5th normal form relational rows, and SPARQL as a query 
>language that augments the capabilities of SQL (not as a replacement for it).  
>The real problem that neither solves is to get agreement on vocabulary and on 
>the interpretation of individual data.
> Now, as to persons in industry who think their relational database systems 
>are "legacies", some of them are right.  Others are why the IT industry is 
>permitted to waste billions of dollars/euros/yen on fad technologies and 
>repainting of old ideas.
> What makes a system a "legacy" is not the technology used, unless that 
>technology is no longer supported, but rather its relevance to the way you 
>currently do business.  There were a lot more database designers in the 1980s 
>and 1990s than there were competent modelers who could build properly 
>extensible conceptual schemas.  So, a lot of the purpose-built databases were 
>brittle designs at the outset and have become as much a part of the problem as 
>the business practice moved on.  It is poor fault analysis to say this is a 
>consequence of the technology, without determining that the fault was not in 
>the design.  ("A poor workman blames his tools.")   (If your database and your 
>processing software assume that all of your products will be sold in barrels, 
>by volume, and you subsequently get into the business of synthetic fibre, 
>which is sold in spools, by length, is your legacy problem the fact that you 
>used an RDBMS?)
> Doing a new poor design with new technology just creates more expense and a 
>new legacy system with a shorter lifecycle.  This is particularly the case 
>when the would-be designers have little prior experience in what makes a good 
>and flexible model, in the mistaken belief that their new technology will make 
>up for their personal incompetence.  (I have seen a lot of weak or downright 
>bad OWL models, and the problem is not in the technology.)  Database design, 
>whatever the target technology, is a SKILL.  You have to learn the skill, and 
>it involves understanding the technology, becoming familiar with the business 
>problem space and the intended usages, and learning the art of 
>"abstraction", the "art of design".  In most failed database projects, the 
>devil is not in the details, but rather in the overall conceptualization, or 
>the lack of one.
> The grave danger here is that we must teach the emerging workforce to use the 
>solid technologies that are in use in industry in good designs, so that those 
>technologies will continue to be supported.  We can allow the technologies 
>that have fallen out of use, in favor of clearly better ones, to die.  But we 
>should not allow pursuit of fads to destroy the future support for viable 
>technologies that are in use.  The software industry has 60 years of 
>experience.  If the new workforce only knows about the last 10, we have a 
>serious education problem.  This industry desperately needs to examine new 
>technologies in the light of older technologies and ask what is really 
>different and how that is better; otherwise it spends a lot of time and money 
>relearning the same lessons.
> In so many words, RDF and RDBMS are closely related technologies, and neither 
>SQL nor RDF is the solution to any problem.  They are tools, and a good 
>workman will figure out which to use and how, when dealing with a specific 
> -Ed
> (My blog of the week...)
> --
> Edward J. Barkmeyer                     Email: edbark@xxxxxxxx
> National Institute of Standards & Technology
> Systems Integration Division
> 100 Bureau Drive, Stop 8263             Work:   +1 301-975-3528
> Gaithersburg, MD 20899-8263             Mobile: +1 240-672-5800
> "The opinions expressed above do not reflect consensus of NIST,
>  and have not been reviewed by any Government authority."

++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@xxxxxxxxxxxx
++  http://www.netestate.de/
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel


