On 8/29/14 5:27 PM, Hans Polzer wrote:
Patrik,
One major issue with Big Data (and any data, for that
matter) is the issue of the scope of the data sets,
which is usually left implicit and often inferred from
external knowledge about the data source. By scope of
the data sets I mean what portion of reality does the
data set purport to represent. Sometimes the complexity
of the data representation in a data set is due to
explicit inclusion of scope information, but usually
scope is left unspecified in the data representation.
For example, if one is trying to determine air traffic
patterns from data sets provided by the various
national/regional air traffic authorities or airlines,
aside from all the differences in representation and
complexity of such data by the different sources, one
has to determine what portion of the overall air traffic
is captured by the aggregate sources one has access to,
and whether there is any overlap among the sources (and
what the nature of the overlap might signify with
respect to one's objective for accessing the data
sources). Do some of the sources include general
aviation traffic or only scheduled commercial
(passenger?) traffic? What portion of the world's air
traffic (of a particular set of types) do we not have
data sources for? Are the time ranges of the data
sources compatible with the data access objectives?
Does a particular source include military aircraft
traffic? Does it include charters? Does it include
Government executive aircraft?
What about helicopter traffic or lighter-than-air
traffic or UAVs? Across what range of vehicle sizes?
What about sub-orbital or orbital traffic (even
if one excludes "space" traffic as not being "air"
traffic, space and orbital traffic typically traverses
the atmosphere when launched and often returns through
the atmosphere)? Are hovercraft considered air traffic?
What about gliders, paragliders, and "airsuits", or are
we only interested in powered aircraft or fuel-burning
aircraft (not all powered craft burn fuel)? Note that
there are fuel-burning paragliders. Are rockets/missiles
and artillery considered "air" traffic?
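One way to make these scope questions concrete is to have each source
declare its scope explicitly and then compare the declarations with what
one actually wants to know. The following is only a minimal sketch under
stated assumptions: the SourceScope class, the coverage_report function,
the source names, and the region, traffic-type, and year values are all
invented for illustration and do not describe any real air traffic data
set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceScope:
    """Hypothetical explicit scope declaration for one data source."""
    name: str
    regions: frozenset
    traffic_types: frozenset
    first_year: int
    last_year: int

    def cells(self):
        """Every (region, traffic type, year) combination the source claims to cover."""
        return {
            (region, kind, year)
            for region in self.regions
            for kind in self.traffic_types
            for year in range(self.first_year, self.last_year + 1)
        }


def coverage_report(sources, wanted):
    """Split the wanted scope into covered, missing, and multiply-covered cells."""
    covered = set()
    overlap = set()
    for source in sources:
        cells = source.cells() & wanted
        overlap |= covered & cells  # cells already claimed by an earlier source
        covered |= cells
    return {"covered": covered, "missing": wanted - covered, "overlap": overlap}


# Invented example data: two sources with partly overlapping declared scope.
sources = [
    SourceScope("authority_a", frozenset({"EU"}),
                frozenset({"scheduled", "charter"}), 2012, 2013),
    SourceScope("airline_b", frozenset({"EU", "US"}),
                frozenset({"scheduled"}), 2013, 2013),
]

# The portion of reality we actually want to reason about.
wanted = {
    (region, kind, 2013)
    for region in ("EU", "US")
    for kind in ("scheduled", "general_aviation")
}

report = coverage_report(sources, wanted)
print("missing:", sorted(report["missing"]))
print("overlap:", sorted(report["overlap"]))
```

The sketch only works when scope is declared at all, which is the point
of the paragraph above: without such declarations the "missing" and
"overlap" sets have to be inferred from external knowledge about each
source.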
When one accesses Big Data for some purpose, what has
one really accessed?
Data distorted by qualification using a meaningless
buzz-phrase.
How big is "Big"?
Reinforcing my comment above.
More importantly, how big a portion of what one is
looking for does Big Data represent?
Ditto.
And what can one safely conclude for the purposes at hand,
given that scope information (assuming it is available
or can be inferred)? I'm not sure this is totally a
question of logic.
Correct, it is inevitably illogical, thanks to the
meaningless nature of a classic marketing buzz-phrase [1].
## Nanotation Start ##
<https://twitter.com/hashtag/Buzzword#this>
a skos:Concept ;
owl:sameAs <http://dbpedia.org/resource/Buzzword> ;
is foaf:primaryTopic of <http://dbpedia.org/describe/?url=""> .
## Nanotation End ##
Links:
[1] https://twitter.com/kidehen/status/492757872816971776 -- What is a buzzword or buzz-phrase?
Hans Polzer
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this