Re: [ontolog-forum] [External] Re: What is Data? What is a Datum? 2013-0

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Fri, 11 Jan 2013 12:15:23 -0500
Message-id: <50F048AB.1030803@xxxxxxxxxxx>
Bill, Dan, Doug, and Ed,

I agree with all of you about the need for clear definitions and
systematically related systems of definitions.

WB
> I generally support the position Dan Gilman puts forth.  Data is
> something physical and observable (a point not emphasized in this
> discussion) and /may/ have meaning associated with it.

DF
> This is a novel property of the term.  I suggest using a clearly
> different term for such a meaning...

I agree that we need better terms.  But they must be (a) easy to
remember, (b) easy to use precisely, and (c) systematically related
to terminology commonly used in science and engineering.

With his theory of signs, C. S. Peirce was precise and systematic.
His most primitive triad (two-thirds of which is widely used) is

    Mark, Token, Type

Every sign consists of an observable, but not yet interpreted mark.
Every interpretation classifies that mark as a token of some type,
but the number of different types for the same mark is open ended.

WB
> Consider data collected by radio telescopes … does it have meaning?
> Maybe.  Words on a page are data; a speech that you listen to at
> a conference is data.

More generally, everything we see, hear, feel, smell, or taste
is a mark.  Every perception is an interpretation of marks, but
it may be an optical illusion, a reflection, a recording, or a
deliberately generated illusion for benign or sinister purposes.

WB
> data is a physical thing, usually imbued with meaning by our
> articulation and interpretation processes (which may be embodied
> by the software we write.)

That's why we must always consider the full triad:  mark, token, type.
The mark is always physical.  The token is always an interpretation,
and the type (of which there may be many) is one of many meanings.
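
As a rough illustration of the triad (a minimal sketch, not anything
taken from Peirce or from the standards discussed in this thread), the
relationship can be written as a small data structure.  The class names
Mark and Token, the function interpret, and the use of plain strings as
type names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mark:
    """An observable but not yet interpreted pattern (here: raw bytes)."""
    raw: bytes


@dataclass(frozen=True)
class Token:
    """One interpretation: a mark classified as an instance of some type."""
    mark: Mark
    type_name: str  # hypothetical: a type is represented only by its name


def interpret(mark: Mark, type_names: List[str]) -> List[Token]:
    """Classify the same mark under each of the given types."""
    return [Token(mark, name) for name in type_names]


# The same physical mark can be a token of many different types.
mark = Mark(b"1011")
tokens = interpret(
    mark, ["binary numeral", "decimal numeral", "character string"]
)
```

The single mark yields three tokens, one per type.  Nothing in the mark
itself fixes which of those interpretations is intended.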

EB
> We disagree on the definition of 'datum'.  Your definition makes it
> a synonym for the ISO 1087 term "designation" (at least as formally
> recast in the OMG SBVR specification, with the assistance of ISO TC37
> experts).

I certainly agree with the need for standardized terminology.  But I'd
recommend Peirce's term 'mark' instead of 'designation'.  The two terms
'type' and 'token' are already widely known and used.  The word 'mark'
is a short, simple word whose technical sense in Peirce's triad is one
of its most common uses.

EB
> I note carefully that ISO 11404 defines "equality" on "datatypes". It
> certainly does not define "equality" on "concepts", which is way beyond
> its scope.  Further ISO 11404 defines a "datatype" to have a "value
> space", which, for a concept would be its extension, or for the sign,
> its denotation.  I don't think ISO 11404 defines "datum", nor does it
> define the relationship between "datatype" and "datum"...

When the experts can't agree, the probability of getting anybody else
to adopt and use their terms precisely is vanishingly small.

DG
>> If a datum is only a meaning, then what distinguishes it from information?
>> And why do we have a representation for it?

EB
> I simply don't understand these questions.  I doubt that we agree on the
> definition of "information"; and it seems to me that we have to have
> representations for meanings in order to convey our intent to others.

I strongly recommend Peirce's triad of mark, token and type as a basis
for analyzing, defining, and relating all this terminology.

EB
> In my view, a table for which you don't know how to interpret a row is
> not 'data'; it is just an image.  It might as well be a JPEG of a
> drawing.

The general term for tables, JPEGs, or anything else that might be
stored in a computer is 'sign'.  If you don't know how to interpret it,
you can just call it a 'mark'.  The word 'image' is already a simple
interpretation, which a computer could infer from the ".jpg" part
of the file name.  But the number and kinds of detailed interpretations
of an image are enormous.
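
The file-name inference mentioned above can be sketched with Python's
standard mimetypes module.  The function name coarse_interpretation and
the fallback label 'mark' are assumptions for illustration; the sketch
looks only at the file name and never at the contents.

```python
import mimetypes


def coarse_interpretation(filename: str) -> str:
    """Return a coarse type guessed from the file name alone.

    Falls back to 'mark' when the name gives no hint about how to
    interpret the stored bytes.
    """
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "mark"
    # Keep only the top-level media type, e.g. 'image' from 'image/jpeg'.
    return mime.split("/")[0]


jpeg_guess = coarse_interpretation("drawing.jpg")
unknown_guess = coarse_interpretation("blob")
```

Here "drawing.jpg" is classified as an image and the extensionless
"blob" stays a mark.  Any more detailed interpretation of the image
would need far more than the file name.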

For a very brief intro to Peirce's semiotics, see Section 2 (pp 3 to 9)
of http://www.jfsowa.com/pubs/rolelog.pdf

I doubt that ISO or any other standards body is likely to adopt
Peirce's full system.  But the triad of mark, token, and type
can be used to define other terminology more precisely.

John
