
Re: [ontolog-forum] Simple Glossary of Data related Terms

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Rich Cooper" <rich@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 4 Jan 2014 12:51:21 -0800
Message-id: <6DECD1B9BCC04BBCB5BA2B0DD435A085@Gateway>

Yes.  For example, the “gender” property values could be M, F, L, G, B, or T.  I also saw an article the other day about a man who was born with two penises, and hermaphrodites are born occasionally.


It’s very difficult to fully understand the entire context around even relatively simple properties that most people don’t bother to think about. 





Rich Cooper


Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Hans Polzer
Sent: Saturday, January 04, 2014 12:26 PM
To: '[ontolog-forum] '
Subject: Re: [ontolog-forum] Simple Glossary of Data related Terms


Kingsley, Rich:


I’d suggest adding something in the definition of data that addresses the issue of scope and frames of reference that are usually implicit in the representation of data. The inclusion of the term “big data” in the glossary and the discussion of boundaries and openness underscore this point. Data is about something or a portion of something and not everything, i.e., it has scope, which unfortunately is not usually explicitly defined. It also has one or more frames of reference in which it is represented, such as character sets, numbering systems, units of measure, naming conventions and namespaces, physical/spatial environmental assumptions, socio-political norms/perspectives, etc., as well as the notion of language already cited in the definition. For example, what are the correct, allowable values for data elements such as sex, gender, sexual orientation, or political affiliation? I realize this forum doesn’t care much for the notion of context and its scope, but most data I have run across in my career in information systems carries with it all sorts of context and scope assumptions. Interpreting the data properly for whatever purposes someone might have requires an understanding of those context/scope assumptions and how they might relate (or not) to the corresponding assumptions appropriate to the purposes of those seeking/accessing/viewing the data in question.


Linked data provides links for one such set of scope/context assumptions, determined by the link creator. But data can be linked in a multiplicity of ways for a multiplicity of purposes – following multiple ontologies and related operational or institutional domains and their respective scope and underlying perspectives, frames of reference, and purposes for representing/amassing data in the first place.
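
As a rough illustration of the point about implicit scope and frames of reference, a value can be stored together with the frame it was recorded against and the scope it covers. This is only a minimal sketch: the frame names, the code lists, and the `ScopedValue`, `is_valid`, and `reinterpret` names are all hypothetical and do not come from the glossary under discussion.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical code lists: each frame of reference names its own allowable values.
CODE_LISTS = {
    "frame-a/sex": {"M", "F"},
    "frame-b/sex": {"M", "F", "X", "U"},
}


@dataclass(frozen=True)
class ScopedValue:
    value: str
    frame: str  # which code list the value was recorded against
    scope: str  # what portion of the world the datum is about


def is_valid(datum: ScopedValue) -> bool:
    """True when the value is allowable in the datum's own stated frame."""
    return datum.value in CODE_LISTS.get(datum.frame, set())


def reinterpret(datum: ScopedValue, target_frame: str) -> Optional[ScopedValue]:
    """Carry a datum into another frame only if its value is allowable there."""
    if datum.value in CODE_LISTS.get(target_frame, set()):
        return ScopedValue(datum.value, target_frame, datum.scope)
    return None  # no safe interpretation without an explicit mapping


record = ScopedValue("X", "frame-b/sex", "hypothetical intake records, one site")
print(is_valid(record))                    # valid in its own frame
print(reinterpret(record, "frame-a/sex"))  # not representable in the other frame
```

The same value is acceptable in one frame and uninterpretable in another, which is the situation that arises when the frame and scope are left implicit.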




From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Kingsley Idehen
Sent: Saturday, January 04, 2014 1:37 PM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] Simple Glossary of Data related Terms


On 1/4/14 12:50 PM, Rich Cooper wrote:

Dear Kingsley,


The definition you offered for “big data”:


“Data that's disparately located, varied in structure, voluminous, and rapidly changing.”


doesn’t fit what most uses of that term seem to imply.  Businesses maintain big data history files, which they mine to discover knowledge.  But normally, that big data is stored in a data center on the local area network (not on the internet per se) to protect it from outside eyes.  Your definition emphasizes the internet, with domains on it having lots of data which can be linked.  Most big data is not really linked; it comes from SQL and NoSQL databases whose contents were captured within the business.

I am not insinuating that "Big Data" is linked, quite the contrary. My claim is that "Big Data" is a term that refers to:

“Data that's disparately located, varied in structure, voluminous, and rapidly changing.”


For example, many store chains keep records of how customers visit retail aisles, how much time they spend at each section, and what they finally buy before leaving.  The stores may forward the data to HQ over the internet, but it is protected by encryption, VPNs, etc., to keep it from prying eyes.  After it reaches HQ, it is stored and mined.


So my suggestion is to differentiate “Linked Big Data” from the usual “Big Data”.  That way, you can distinguish which kind is being described. 

Yes, which is why I treat "Big Data" as a term that's distinct from "Linked Data", "Linked Open Data", and the "Linked Open Data Cloud" in this document.

Happy New Year,
Kingsley Idehen       
Founder & CEO 
OpenLink Software     
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
