ontolog-forum

Re: [ontolog-forum] [External] Re: What is Data? What is a Datum? 2013-0

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: William Frank <williamf.frank@xxxxxxxxx>
Date: Fri, 11 Jan 2013 15:46:16 -0500
Message-id: <CALuUwtAn341bJx4ZbxeJMW1AM6LQJbjYRPqTaeBN9XvhLKQTBw@xxxxxxxxxxxxxx>


On Fri, Jan 11, 2013 at 1:20 PM, Burkett, William [USA] <burkett_william@xxxxxxx> wrote:
Thanks for "mark token type", John.

My primary reason for jumping into this discussion is to submit/contribute/offer some simple - and in my mind, clear and unambiguous - definitions for data and information that I've found useful and that have worked for me. Peirce's triad fits my definitions/uses perfectly:

Data is always a physical mark.

Well, then, following Peirce, Kant, and all the rest, the number three hundred and sixty five would not be data.

Only some, perhaps black, areas on a particular white background would be data.
 

Information is a token - the meaning interpreted from data/marks or encoded in/by the articulation of data/marks (the intangible stuff "put into" or "derived from" data)

In other words, the number three hundred and sixty five would be information. This is not plausible. Values of what others call data types, such as the natural numbers, are what most people think of as data. This discussion has mostly said no: only when you know the statement in which the number is used do you have a datum. Because, by the time we are doing data processing, we know what those bits in the computer represent: a number, a character, a truth value, a color ...
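
As a rough illustration of the point about bits (a sketch only; the four bytes and the variable names below are assumptions chosen for the example, not anything from the thread), the same stored bytes can be read as a number, a character, a truth value, or a color, depending on what we already know they represent:

```python
import struct

# Four arbitrary bytes, chosen (as an assumption) so that one reading is 365.
raw = bytes([0x00, 0x00, 0x01, 0x6D])

as_int = int.from_bytes(raw, "big")     # read as a big-endian unsigned integer
as_float = struct.unpack(">f", raw)[0]  # read as a big-endian 32-bit float
as_char = chr(raw[-1])                  # read the last byte as an ASCII character
as_bool = any(raw)                      # read as a truth value: is any byte non-zero?
as_rgba = tuple(raw)                    # read as four color channels

print(as_int, as_float, as_char, as_bool, as_rgba)
```

Nothing in the bytes themselves selects one of these readings; the choice comes from the program that processes them.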
 
Knowledge is a compendium (so to speak) of types that only exists in human minds.  It's used to ascertain the information/token when perceiving or creating data/marks.

So, the statement that there are 365 days in a year  would be knowledge.

I have been wondering what the IT consulting tribe has meant by this for more than 20 years.

This is very helpful to understand, but I am not sure it is so helpful to apply.
 
So, IM-ever-so-HO, terms like "knowledge representation"/"ontology" and "reasoning" are just high-falutin' names for "carefully constructed data models" and "clever data processing", respectively.

I realize that the positions/understandings of others in this forum differ significantly from this, and offer my position with a bit of trepidation (as an occasional and under-qualified participant in this forum :-)) and a bit of tongue-in-cheek thought-provocation.

 

Bill





_________________

William C. Burkett   Associate

Booz | Allen | Hamilton

121 S Tejon St # 900 | Suite 900 South Tower | Colorado Springs, CO, 80903

T: 719-387-6452 | M: 310-318-5500 | F: 719-387-2020

burkett_william@xxxxxxx


-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
Sent: Friday, January 11, 2013 10:15 AM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] [External] Re: What is Data? What is a Datum? 2013-01-09-0930

Bill, Dan, Doug, and Ed,

I agree with all of you about the need for clear definitions and systematically related systems of definitions.

WB
> I generally support the position Dan Gilman puts forth.  Data is
> something physical and observable (a point not emphasized in this
> discussion) and /may/ have meaning associated with it.

DF
> This is a novel property of the term.  I suggest using a clearly
> different term for such a meaning...

I agree that we need better terms.  But they must be (a) easy to remember, (b) easy to use precisely, and (c) systematically related to terminology commonly used in science and engineering.

With his theory of signs, C. S. Peirce was precise and systematic.
His most primitive triad (two-thirds of which is widely used) is

    Mark, Token, Type

Every sign consists of an observable, but not yet interpreted mark.
Every interpretation classifies that mark as a token of some type, but the number of different types for the same mark is open ended.
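
One way to picture the triad in programming terms is the minimal sketch below. It is an illustration under stated assumptions, not Peirce's or anyone's formal definition: the class names `Mark` and `Token`, the `INTERPRETERS` registry, and the three example types are all hypothetical. A mark is an uninterpreted pattern, and each interpretation yields a token of some type; the registry can be extended indefinitely, which mirrors the open-ended number of types.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Mark:
    """An observable but not yet interpreted pattern (here, a run of characters)."""
    pattern: str


@dataclass(frozen=True)
class Token:
    """A mark classified as an instance of some type."""
    mark: Mark
    type_name: str
    value: object


# Hypothetical, open-ended registry: each entry reads a pattern as one type,
# raising ValueError when the pattern cannot be read that way.
INTERPRETERS: Dict[str, Callable[[str], object]] = {
    "decimal numeral": int,
    "hexadecimal numeral": lambda s: int(s, 16),
    "character string": str,
}


def interpret(mark: Mark) -> List[Token]:
    """Return one token for every registered type the mark can be read as."""
    tokens = []
    for type_name, reader in INTERPRETERS.items():
        try:
            tokens.append(Token(mark, type_name, reader(mark.pattern)))
        except ValueError:
            pass  # this type does not apply to this mark
    return tokens


tokens = interpret(Mark("365"))
for token in tokens:
    print(token.type_name, "->", token.value)
```

The same mark "365" comes back as three tokens of three different types, and adding an entry to the registry adds another possible interpretation without changing the mark.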

WB
> Consider data collected by radio telescopes ... does it have meaning?
> Maybe.  Words on a page are data; a speech that you listen to at a
> conference is data.

More generally, everything we see, hear, feel, smell, or taste is a mark.  Every perception is an interpretation of marks, but it may be an optical illusion, a reflection, a recording, or a deliberately generated illusion for benign or sinister purposes.

WB
> data is physical thing, usually imbued with meaning by our
> articulation and interpretation processes (which may be embodied by
> the software we write.)

That's why we must always consider the full triad:  mark, token, type.
The mark is always physical.  The token is always an interpretation, and the type (of which there may be many) is one of many meanings.

EB
> We disagree on the definition of 'datum'.  Your definition makes it a
> synonym for the ISO 1087 term "designation" (at least as formally
> recast in the OMG SBVR specification, with the assistance of ISO TC37
> experts).

I certainly agree with the need for standardized terminology.  But I'd recommend Peirce's term 'mark' instead of 'designation'.  The two terms 'type' and 'token' are already widely known and used.  The word 'mark' is a short, simple word whose technical sense in Peirce's triad is one of its most common uses.

EB
> I note carefully that ISO 11404 defines "equality" on "datatypes". It
> certainly does not define "equality" on "concepts", which is way
> beyond its scope.  Further ISO 11404 defines a "datatype" to have a
> "value space", which, for a concept would be its extension, or for the
> sign, its denotation.  I don't think ISO 11404 defines "datum", nor
> does it define the relationship between "datatype" and "datum"...

When the experts can't agree, the probability of getting anybody else to adopt and use their terms precisely is vanishingly small.

DG
>> If a datum is only a meaning, then what distinguishes it from information?
>> And why do we have a representation for it?

EB
> I simply don't understand these questions.  I doubt that we agree on
> the definition of "information"; and it seems to me that we have to
> have representations for meanings in order to convey our intent to others.

I strongly recommend Peirce's triad of mark, token and type as a basis for analyzing, defining, and relating all this terminology.

EB
> In my view, a table for which you don't know how to interpret a row is
> not 'data'; it is just an image.  It might as well be a JPEG of a
> drawing.

The general term for tables, JPEGs, or anything else that might be stored in a computer is 'sign'.  If you don't know how to interpret it, you can just call it a 'mark'.  The word 'image' is already a simple interpretation, which a computer could infer from the ".jpg" part of the file name.  But the number and kinds of detailed interpretations of an image are enormous.

For a very brief intro to Peirce's semiotics, see Section 2 (pp 3 to 9) of http://www.jfsowa.com/pubs/rolelog.pdf

I doubt that ISO or any other standards body is likely to adopt Peirce's full system.  But the triad of mark, token, and type can be used to define other terminology more precisely.

John



--
William Frank

413/376-8167


