ontolog-forum

Re: [ontolog-forum] Undefined Data: A Survey Of Big Data Definitions

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Gary Berg-Cross <gbergcross@xxxxxxxxx>
Date: Thu, 3 Oct 2013 14:41:33 -0400
Message-id: <CAMhe4f1do0uCTuUxDH_jLO5V3AsL0jE7Tg7sbpZw0V=qkP=UuQ@xxxxxxxxxxxxxx>
I would be happy even to see a good definition of "data set" vs. "data collection" within some part hierarchy, with a smallest unit of, say, "data element."
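
A minimal sketch of what one such part hierarchy might look like, in Python. All class and field names here are hypothetical, and the layering (a collection contains sets, a set contains elements) is only one assumption among several possible ones, not an agreed definition:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElement:
    """Smallest unit in the hierarchy (assumed atomic for this sketch)."""
    name: str
    value: object


@dataclass
class DataSet:
    """A named group of data elements (hypothetical definition)."""
    name: str
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class DataCollection:
    """A named group of data sets (hypothetical definition)."""
    name: str
    data_sets: List[DataSet] = field(default_factory=list)

    def all_elements(self) -> List[DataElement]:
        # Parthood is treated as transitive here: an element of a member
        # data set also counts as part of the collection.
        return [e for ds in self.data_sets for e in ds.elements]


# Illustrative data only; none of these names come from the thread.
readings = DataSet("readings", [DataElement("temp_c", 21.5),
                                DataElement("humidity", 0.4)])
sites = DataSet("sites", [DataElement("site_id", "A1")])
collection = DataCollection("field_survey", [readings, sites])

print([e.name for e in collection.all_elements()])
```

Even this toy version forces decisions a real definition would have to settle, such as whether a collection may contain other collections and whether an element may belong to more than one set.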

Gary Berg-Cross, Ph.D.  
NSF INTEROP Project  
SOCoP Executive Secretary
Knowledge Strategies    
Potomac, MD
240-426-0770


On Thu, Oct 3, 2013 at 11:58 AM, John F Sowa <sowa@xxxxxxxxxxx> wrote:
Ali,

Thanks for the pointer:

> http://arxiv.org/abs/1309.5821
>
> Undefined By Data: A Survey of Big Data Definitions

Observation:  Some terms that do *not* occur in that paper:
semantic(s), ontology, RDF, OWL, linked data, LOD.

John
_______________________________________________________________________

Excerpts from http://arxiv.org/abs/1309.5821

Microsoft provides a notably succinct definition: “Big data is the term
increasingly used to describe the process of applying serious computing
power - the latest in machine learning and artificial intelligence - to
seriously massive and often highly complex sets of information” ...

The Method for an Integrated Knowledge Environment (MIKE2.0) project,
frequently cited in the open source community, introduces a potentially
contradictory idea: “Big Data can be very small and not all large
datasets are big” [1]. This is an argument in favour of complexity and
not size as the dominant factor...

This idea is supported by the NIST definition which states that big data
is data which: “exceed(s) the capacity or capability of current or
conventional methods and systems” [3]. Given the constantly advancing
nature of computer science this definition is not as valuable as it may
initially appear. The assertion that big data is data that challenges
current paradigms and practices is nothing new...

Notably all definitions make at least one of the following assertions:

Size: the volume of the datasets is a critical factor.

Complexity: the structure, behaviour and permutations of the datasets
are critical factors.

Technologies: the tools and techniques which are used to process a
sizable or complex dataset are critical factors.

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J


