
Re: [ontolog-forum] What is Data? What is a Datum?

To: "doug@xxxxxxxxxx" <doug@xxxxxxxxxx>, "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Wed, 9 Jan 2013 18:15:32 -0500
Message-id: <50EDFA14.2000407@xxxxxxxx>

On 1/9/2013 4:42 PM, doug foxvog wrote:
> <snip>
> I note that databases traditionally encode high arity relations.  Very often
> a single column does not relate the value in that column for a row to the
> thing represented by the key of the row.  It takes multiple columns to do
> so.    (01)

Database theory refers to different degrees of "relatedness" in a 
relational table as "normal forms".  The most common form one sees is 
"third normal form":  each row is a set of facts about one individual.  
Some set of columns (the key) identifies the individual, and each of its 
properties is represented by one or more other columns.  This form is the 
one commonly identified as "best practice".    (02)

A "fourth normal form" relation represents an atomic n-ary semantic 
relationship involving one or more individuals.  That is, each row of a 
4th normal form table represents exactly one atomic proposition.  Each 
individual is represented by a set of 1 or more columns whose values 
together identify a unique individual.    (03)

One of my favorite manufacturing examples is:  Machine type M requires T 
time to perform operation P on material S.    (04)

One can consider that to be a quaternary relationship, representing a 
function that takes three arguments (Machine type, operation, material) 
and produces a Time value.    (05)

But one might also consider it to be a binary relationship representing 
the same function, seen as having one argument that is an "instantiated 
operation", where the instantiated operation object is identified by the 
machine type, the operation, and the material. From the 4th normal form 
point of view, those are different descriptions of the same table and 
they don't really change the interpretation of a row.    (06)
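
To make the two readings concrete, here is a minimal Python sketch. The sample rows, the name InstantiatedOperation, and the helper functions are all hypothetical, chosen only for illustration. It shows the same rows read once as a set of 4-tuples and once as a mapping from an "instantiated operation" to a time.

```python
from typing import NamedTuple

# Hypothetical sample rows: (machine_type, operation, material, minutes).
ROWS = [
    ("lathe-A", "turn", "steel", 12.0),
    ("lathe-A", "turn", "aluminum", 7.5),
    ("mill-B", "face", "steel", 9.0),
]


class InstantiatedOperation(NamedTuple):
    """Hypothetical object identified by machine type, operation, material."""
    machine_type: str
    operation: str
    material: str


# Quaternary reading: the relation is a set of 4-tuples.
quaternary = set(ROWS)


def time_required(machine_type, operation, material):
    """The same relation viewed as a function of three arguments."""
    for m, p, s, t in quaternary:
        if (m, p, s) == (machine_type, operation, material):
            return t
    raise KeyError((machine_type, operation, material))


# Binary reading: one argument (the instantiated operation) maps to a time.
binary = {InstantiatedOperation(m, p, s): t for m, p, s, t in ROWS}


def rows_of_binary(relation):
    """Flatten the binary reading back into 4-tuples."""
    return {(*op, t) for op, t in relation.items()}
```

Flattening the binary reading gives back exactly the quaternary rows, which is the sense in which the interpretation of a row does not change.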

The table, BTW, may have more than 4 columns, because a "machine type" 
or a "material" might have a "composite key" that takes more than one 
value (column) to represent, e.g., manufacturer and manufacturer's 
product id.  So the binary interpretation of the table is that the 
columns making up the identifiers for machine type, operation, and 
material constitute one big "composite key" for the "instantiated 
operation".    (07)
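
As a rough sketch of the composite-key case, using Python's standard sqlite3 module: the table name, column names, and sample values below are assumptions made up for the example. Machine type and material each take two columns (manufacturer and product id), and all five identifying columns together form the key of the "instantiated operation".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: five identifying columns, one Time column.
conn.execute("""
    CREATE TABLE operation_time (
        machine_mfr         TEXT NOT NULL,
        machine_product_id  TEXT NOT NULL,
        operation           TEXT NOT NULL,
        material_mfr        TEXT NOT NULL,
        material_product_id TEXT NOT NULL,
        minutes             REAL NOT NULL,
        PRIMARY KEY (machine_mfr, machine_product_id, operation,
                     material_mfr, material_product_id)
    )
""")

INSERT = "INSERT INTO operation_time VALUES (?, ?, ?, ?, ?, ?)"
conn.execute(INSERT, ("Acme", "L-100", "turn", "SteelCo", "S-304", 12.0))


def insert_is_rejected(row):
    """Return True if the composite key rejects the row as a duplicate."""
    try:
        conn.execute(INSERT, row)
    except sqlite3.IntegrityError:
        return True
    return False


# Same five key values, different time: a second proposition about the
# same instantiated operation, so the key rejects it.
duplicate_rejected = insert_is_rejected(
    ("Acme", "L-100", "turn", "SteelCo", "S-304", 15.0))

# One key column differs (the material's product id): a different
# instantiated operation, so the row is accepted.
other_material_rejected = insert_is_rejected(
    ("Acme", "L-100", "turn", "SteelCo", "S-316", 14.0))

row_count = conn.execute("SELECT COUNT(*) FROM operation_time").fetchone()[0]
```

The key constraint is what makes the table a function from the instantiated operation to a single Time value.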

By comparison, in 5th normal form, there is exactly one column for each 
participating individual in an instance of the represented 
relationship.  The idea is to eliminate any ambiguity in the 
interpretation of combinations of key columns.  In that form, the binary 
interpretation will have exactly two columns, while the quaternary 
interpretation will have four.  Further, many experts would agree that, 
because the binary interpretation is a function, the binary table is the 
proper 5th normal form.    (08)

The point of all this is that, although database theory speaks of 
relational algebras, keys, functions and relations, the "normal form" 
ideas are about the relationship between tables and propositions.    (09)

As Doug says:    (010)

> Both the high-arity relation and the data base provide foundations for such
> propositions.    (011)

I completely agree with the following as well, but that isn't the point 
of this email.
> I find arguments for basing semantics on triples to be similar to arguments
> for basing arithmetic computations on Peano arithmetic, due to its
> providing a foundation for arithmetic.  A system based on Peano arithmetic
> has no need for a system to encode addition, subtraction, or multiplication,
> or their associated tables.  Sure, all that can be derived, but the
> efficiency
> leaves something to be desired, and the clarity of operations (reasoning)
> is hidden.  I find that the same holds for restricting the encoding of
> semantics to triples.
>
> -- doug f    (012)

-Ed    (013)

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Systems Integration Division, Engineering Laboratory
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                Cel: +1 240-672-5800    (014)

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."    (015)


